EASILY Bulk Load files into a table with different columns

with NO FORMAT FILE!

I consider this a gem just hiding on the MSDN docs :

Using BULK IMPORT on a View

The following example creates the v_myTestSkipCol on the myTestSkipCol table. This view skips the second table column, Col2. The example then uses BULK INSERT to import the myTestSkipCol2.dat data file into this view.

In the SQL Server Management Studio Query Editor, execute the following code:

CREATE VIEW v_myTestSkipCol AS
SELECT Col1,Col3
FROM myTestSkipCol;
GO

USE AdventureWorks2008R2;
GO
BULK INSERT v_myTestSkipCol
FROM ‘C:myTestSkipCol2.dat’
WITH (FORMATFILE=’C:myTestSkipCol2.xml’);
GO

Say what? You can just write a view and not deal with a nasty format file? This is useful any time you have an additional column you want on your staging table, but it’s not in the file you’re importing, and want to add it as part of your processing. Either its a flag on where the file came from, the file name, the date loaded etc…

Lets pretend, as part of a larger ETL process I just want to import names of people that belong to different teams.

  • I get a file daily for each team with new members
  • I want to transform the data for all teams as one operation after bulk-loading each team file (preferably into the same staging table)
  • Each file has the same name, I get it from an API based on the team name so they look no different when downloaded
-- person staging table
Create Table StagePerson(
    first varchar(50) NULL,
    first varchar(50) NULL,
    team varchar(50) NULL
)

-- a view to match whats in the file
CREATE View vImportStagePerson as
 select first, last
 from person
go

-- my ETL process will download each file, then bulk load into the table
-- ETL loop: for each team
BULK INSERT vImportStagePerson
FROM 'C:[team]members.csv'

UPDATE StagePerson
Set team = [team]
where team is null

-- end for each team

-- do your merge of all new team members from your staging table WITH team added

This way I save additional hops just to add that pesky team name to my staging table. And your final operation can take advantage of a larger set per-transaction.

Leave a Reply

Your email address will not be published. Required fields are marked *