See also :
How to import CSV data to SQL server
How to Import Excel spreadsheet to SQL Server database
How to Validate and Import from Multiple Excel Worksheets
How to Import Excel spreadsheet using Stored Procedure
In this tutorial we will help you complete the task of validating importing data from Multiple Microsoft Excel Workbooks.
One single task will handle importing multiple Excel Workbooks with varying filename from a single folder/directory.
We will demonstrate how to configure a task to complete this goal, and you will able to quickly repeat the process time and time again.
|If you need help with this task, drop us an email to Support@LeanSoftware.net, with an example spreadsheet, and we may well be able to help you.
In this first example we show you how to import to a table or update a table. I another tutorial we will show you how to load and verify the data using a stored procedure.
Setting up the task is a do it once process - EDT saves all your task settings into a task XML file. The next time you want to repeat the process you simply select the Task from the main EDT interface.
Be aware before we start - What are the pitfalls of importing Excel data into SQL Server?
Often companies experience intermittent data import failures when trying to import Excel data.
Your database administrator will be quick to tell you there are going to be problems if you plan to do this regularly.
So how can we make this a secure and reliable process? How can the EDT software help you?
Validation, validation, validation..
- With a visual interface that allows you to correct the data
- Automatic validation highlighting any problems on a per-row basis.
- Validation is not just data type validation - but validation with the destination database itself - this being wrapped in a transaction that is always undone on any validation problem.
- Any number of validation rules can be added.
- Data can be sent to a stored procedure for more advanced validation, the procedure can return validation error messages.
How does EDT compare to other import techniques?
Compared to any other import method EDT gives a huge advantage, and provides a highly reliable import method.
The reason companies have problems importing Excel data is the fact that users can modify data formats or make other changes that render the data or file invalid. EDT efficiently overcomes all these Excel import problems improving business efficiency and data accuracy.
Worked example - Import Multiple Excel Workbooks to SQL Server Table with a 'Select workbook' parameter
Here we assume you have already downloaded the EDT software (you can run this example even with the trial version, please see download page).
In this example we will load, modify, validate and upload from three sample workbooks.
We will see how EDT can help you to easily :
- highlight and reject rows with missing essential data
- allow the user select the Source workbook from a drop down list on task load
- allow for the fact that source column names may differ in each workbook
- after successful validate and send - we will automatically move the workbook to a processed directory for storage/archive
Example source data
We will import from 3 different Excel Workbooks, with a similar layout.
Data from Melbourne office
Data from New York office
Data from London office
STEP 1 - Create New Task
Click the New option to create a new EDT Task.
Enter a task name, then click the Insert or Update table with data from any source.
STEP 2 - Connecting to SQL Server
In this example we will use one of the Native SQL Server drivers:
Enter the essential connection details only :
- Name of your SQL Server
- Connection method
- Select Database name
- Test the connection
|Note that you can also use a Stored Procedure for the destination, this can often be a more flexible solution.
(Notice the Connection string has been populated for you..)
STEP 3 - Select destination table
- NOTE : The chosen destination partly determines the layout of columns in the task
- EDT can thought of as 'using Excel as a Workspace', the layout of the workspace is determined by the data Destination,
- the data displayed (from the data source) is overlayed on top of the destination layout and is determined on the 'Data source / SQL' configuration tab.
- A task does not have to have data source, data can be pasted into a task from the clipboard if you wish, but generally it is much better to use the data source tab as data can be imported from any source.
- Using the 'Destination columns / SQL tab screen' as shown above you can:
- Define task column layout using drag and drop of the column name
- Hide unwanted columns
- Exclude unwanted columns from the SQL
- Insert what we call 'Work columns' - a user defined column that can be used for several purposes - a calculated value for example.
- Apply validation and formatting rules can be applied (data type validation is automatic)
- Define display formatting for example dates and money amounts
- Create 'SQL lookup' columns that can retrieve existing values from the database that match the imported data
- And more!
We will stick to the basics for the purpose of this tutorial.
STEP 4 - Connecting to the first Workbook using the ACE driver
Paste connection string method
You can simply paste in the following connection string, and modify the path of the Excel workbook:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=\\hut8\EDTShares\Excel Import Data\ClientList_Melbourne.xls;Extended Properties="Excel 8.0;HDR=YES;IMEX=1";Persist Security Info=False
|Notice we are using a network share for this example, SQL Server will need read access to this folder to complete this example.
EDT Wizard method
|In this example we use the Ace/Jet driver as we this allows the Excel filename to be specified in the connections string.
To learn more about the various connection strings/drivers you can use to connect to an Excel workbook, please see help sectionSource Connection Strings
You can also create the connection string using the wizard, click the New button:
Select the ACE driver, click Next
If no suitable driver is listed, obtain the ACE driver - see download links on the Excel Connection Strings page.
Enter the path to the Excel file, click 'Test'
Modify the Extended Properties and enter the following:
Select provider Microsoft OLE DB Provider for ODBC Drivers, click Next..
Click OK to return to EDT:
As shown in the above screenshot you will now see now how easy and flexible it is to select data from the Spreadsheet Worksheets using EDT !
- You can select a specific worksheet using the Object Source dropdown
- SQL is generated for you and by default all columns are included
- If you click the 'Custom' SQL radial option - you can manually edit the SQL as you require
- You could modify the SQL to only select a sub set of the data for example by adding a WHERE clause
- The SQL can be modified to only select certain columns form the worksheet
- Add blank columns if you wish by using the NULL keyword eg SELECT [WebCustomerID], NULL, [Email]..
- Click the 'Test' button to check the SQL syntax and view the first 100 records
- You can click the Map Columns button to reorder the select list using drag and drop - showing you a mapping between the destination and source columns
- You can insert Task Parameters using the parameter insert button this allows you to load data matching user parameters - such as a category or date range (User parameters are created using the Parameters tab)
STEP 5 - Making the SQL Generic and Independent of column names
In our example workbooks, although the columns are in the same order, the heading have different names.
Therefore the SQL we use must be generic and not specify column names.
To do this, modify the connection string and change the HDR=YES to HDR=NO
The Ace driver will then disregard the column headings.
Click Test, and you will that the EDT generated SQL has changed to the generic format SELECT [F1], [F2].. etc
No matter what the column headers are, EDT will pull out the data without any problems.
STEP 6 - Map Columns
Click the Map Columns button
In the example above shown above we have initially aligned the columns that match exactly, these being highlighted in green.
You can drag and drop to alter the position/order of both the source and destination columns.
We now need to exclude the unwanted columns, this will complete in step 9 and 10.
The source has a First name and Last name, but the destination only has a Name field, this we will deal with in Step 11.
STEP 7 - Excluding unwanted Source columns
Click the custom SQL option and remove the unwanted source columns from the SQL:
Click Test SQL to verify the syntax and view the data that will be returned by the query:
STEP 8 - Excluding unwanted Destination columns
On the Destination tab for unwanted columns,
- uncheck the Mandatory option,
- uncheck the Include in SQL option
- uncheck the Visible option
Notice that the SQL on the right hand panel is automatically updated to include only the columns required.
| Note that if you wish to update existing data - please see Step 19.
STEP 9 - Custom SQL to concatenate First and Last name fields
To concatenate the first and last name into one column we simply modify the Source SQL to append two fields using the ampersand operator , separating with a space:
STEP 10 - Test the Task
Save the task and return to the main EDT interface, and we will see the data loaded and ready to be imported.
STEP 11 - Remove unwanted rows
Often a spreadsheet will contain unwanted rows, these might be blank rows or row that contain subtotals for example.
As you can see in the above screenshot our data contains some blank rows at the end, and also the unwanted column headers (as we have used the option HDR=NO in the connection string).
These we will remove using the very useful SQL function ISNUMERIC() field to check for a value in column in the spreadsheet.
Returning to the task you will see the effect of removing the unwanted header rows and unwanted blank rows:
STEP 12 - Add data validation rules
There are several ways that EDT handles data validation within your task - ensuring the data is of good quality and good news for your business!
- Data type validation (automatic)
- Mandatory / must have values
- One or more specific validation rules by column using the power of Excel Data validation
- Validation by batch - specifying that all rows rows must be valid before the Send button operation is used
- Validation using fixed lists or SQL lookup Lists
- Server validation - automatic by use of the Validation button
- Additional server validation - applicable when sending data to a stored procedure
In this example we will add three validation rules:
- Email address must be supplied
- Name bust be more than two characters long
- Country must match one of our existing entries the County table, and the allow the user to correct the data by selecting from a drop down list
Marking an item as Mandatory
This is very easy - just click the Mandatory checkbox next to the data item name:
When the data is loaded into the task, any missing mandatory data is highlighted.
Notice that the Active Legend Red status shows a count of two rows . You can use the legend up/down arrows to very quickly locate the rows in each status band.
The left hand column shows the user what is wrong, and the problem cell has a red border:-
Using the power of Excel Data Validation!
Here we will add a validation rule specifying the Name value must be more than three characters:
Additionally, on the Error Alert tab, enter an error description for user feedback:
Notice that multiple validation messages appear when on row 5 of this example, as two column values are invalid:
Validation with Lists
In our example we are importing a Country value, but it must match with a valid country name in an existing Country table on SQL server.
We can easily use EDT sql lookup column capability to validate these values.
Click the SQL list button on the Country row item :
Choose options as shown below to create a Lookup column:
The option shown at the bottom of the dialogue Show validation error of column value not in list must be ticked for lookup validation to take place.
Returning to EDT as you can see below, the missing Country is highlighted:
The user can simply click on the country cell and select the correct country name from the drop down list!
STEP 13 - Creating the File Select task parameter
The main topic of tutorial is how to allow the task user to select from a number of existing Excel files.
To do this we first first create an EDT Task Parameter.
From the Parameters Tab - click Add Parameter:
- Enter a parameter name
- Enter your user prompt text
- Click the the Lookup column icon
We will now create a lookup that will list files matching a filename pattern, we will use an SQL server function to do this.
Set SQL Lookup options as follows:
Paste the following SQL into the SQL Code text box:
SET NOCOUNT ON
IF OBJECT_ID('tempdb..#TempFileList') IS NOT NULL
DROP TABLE #TempFileList
CREATE TABLE #TempFileList(
INSERT INTO #TempFileList(FileName, Depth, IsFile)
EXEC xp_dirtree '\\HUT8\EDTShares\Excel Import Data', 10, 1
SELECT FILENAME FROM #TempFileList WHERE FileName LIKE '%.xls'
The SQl function xp_dirtree is used to obtain the list of files in the specified folder.
Change the highlighted folder name to a folder name on your system
If a user requires access to the files, then a folder on an existing or new network share should be used.
The folder must be accessible to SQL Server and SQL server must have read permissions on the directory, or the SQL function will return an empty list.
Here are the options to set in the SQL lookup dialogue:
On task load the File selection is now presented to the user :
STEP 14 - Insert the Task parameter into the Source Connection String
We wish the selected filename to be used in the data load connection string.
You see now how this solution is coming together quite nicely!
In the task configuration Data Source tab,
- Edit the connection string, removing the filename (not the whole path, just the filename).
- Place the cursor at the position where the filename should be.
- Click the EDT Merge codes button
- Select the Filename Merge code
- Click Insert
Remove any quote marks from around the merge code, which should now look very similar to this example :