PowerPivot from Identical Excel Files

You can use the PowerPivot add-in for Excel 2010 to create a report from multiple Excel workbooks or worksheets, by joining the tables using the Primary and the Foreign key, such as ‘ProductID’ in a Sales table and a Pricing table.

In this example though, we want to combine the data in two Excel files that have an identical structure — sales data for the East and West regions. In this case, we can’t use a key to connect the tables; instead, we want to create one combined table from all the data. The following technique allows you to import more than a million records from Excel, despite the fact that one worksheet can only contain up to 1,048,576 rows. At least that’s possible in theory — on my computer it imported about 1.2 million, then gave up, after whining about memory resources.

Thanks to Excel MVP, Kirill Lapin, for sharing this very helpful tip with us. You can see more of Kirill’s work in last week’s posts on Combining Data from Two Excel Files in a Pivot Table.

Create a Connection in the Workbook

The key to this technique is to start by creating a workbook connection, before you launch PowerPivot.

  1. On the Excel Ribbon’s Data tab, click Connections.
  2. In the Workbook Connections window, click Add
  3. At the bottom of the Existing Connections window, click Browse for More.
  4. Navigate to the folder where your files are located.
  5. Select one of the files that you want to import — EastSales.xlsx in this example — and click Open.
  6. Select a table to import, and click OK.
  7. The new connection appears in the Workbook Connections window.

powerpivotunion05

Combine the Data in PowerPivot

  1. Close the Workbook Connections window, and on the Ribbon, click the PowerPivot tab.
  2. Click PowerPivot Window, to launch the PowerPivot add-in.

powerpivotunion06

Note: If you’re using Windows XP, the PowerPivot window has a menu bar. If you’re using Vista or Windows 7, you’ll see a Ribbon instead.

  1. On the Table menu, click Existing Connections, or, on the Ribbon, click Design, then Existing Connections.
  2. At the bottom of the Existing Connections window, under Workbook Connections, click on the connection that you added, and click Open.
  3. In the Table Import Wizard, click Next, then select the table, and click Finish
  4. After the data is successfully imported, click Close.

powerpivotunion10

Change the SQL Statement

Now that the first table has been imported, you can change its properties, to combine it with data from the second table.

  1. On the Table menu, click Table Properties, or on the Ribbon, click the Design tab, then click Table Properties.
  2. At the right, from the Switch To drop down list, select Query Editor.
  3. Edit the SQL statement, to create a union query, combining the two tables. In this example, the SQL statement is:

SELECT [EastSales$].* FROM [EastSales$] UNION ALL SELECT * FROM ‘C:\_TESTWestSales.xlsx’.[WestSales$]

After you change the SQL statement, click the Validate button, to verify that the statement is correct, then click Save.

powerpivotunion13

Note: The SQL query string can also be edited in the Excel workbook connection window, by selecting the connection, and clicking Properties. However, there’s no Validate feature there.

Create the Pivot Table

Next, you can create a pivot table from the combined data.

  1. On the Toolbar, click the Create a PivotTable button, or on the Ribbon, click the Home tab, then click PivotTable.
  2. Select a location for the pivot table, and click OK.
  3. Add  fields to the pivot table layout, to see a summary of the data.

Here’s the pivot table that was created from the combined data, with columns for the East and West regions. The Report Layout is Tabular, and Number format is used, with thousands separator and zero decimals.

powerpivotunion17

Detailed Instruction and Sample Files

To see detailed instructions for this technique, with more screen shots, visit the PowerPivot from Identical Structure Excel Files page on the Contextures website. That page also has a link for downloading the East and West sales data that I used in this example.

Watch the PowerPivot Video

To see the steps for combining data from multiple tables in PowerPivot, please watch this PowerPivot from Identical Excel Files video tutorial.

Download the PowerPivot Add-In

You can download the free PowerPivot add-in from the Microsoft website: PowerPivot Download

__________

You may also like...

22 Responses

  1. Kirill Lapin (KL) says:

    Now, here comes some interesting stuff :) I’ve done some quick’n’dirty testing.
    For exactly the same data (1.048.469 records) from different sources the data import went as follows:

    Source Max Memory Used Time to Load
    TXT 157 MB 0:00:30
    OLEDB connection to XLSX 475 MB 0:01:28
    XLSX 830 MB 0:02:03

    This may explain why Debra ran out of memory when she attempted to load more than 1.200.000 records :)

  2. Kirill Lapin (KL) says:

    You need to add here the ~140 MB of memory that XL takes up on open, plus ~40 MB for the PowerPivot window :)

  3. Ivan says:

    Hi,

    The site on PowerPivot from Identical Excel Files shows the steps to combine TWO similar files.

    Q: What are the steps for combining THREE OR MORE similar files? I got stuck at the section on Change the SQL Statement.

    The below SQL Statement does not work for me in my Excel 2010.

    SELECT [DATA$].* FROM ‘C:20110413CB1.xlsx’.[DATA$]
    UNION ALL
    SELECT [DATA$].* FROM ‘C:20110413CB2.xlsx’.[DATA$]
    UNION ALL
    SELECT [DATA$].* FROM ‘C:20110413CB3.xlsx’.[DATA$]

    Please help me. Thank you.

    Regards,
    Ivan

  4. Kirill Lapin (KL) says:

    Hi Ivan,

    The error is in the SELECT clause. Between SELECT and FROM you must either list the field names to extract, e.g.:

    SELECT [FieldName1],[FieldName2],[FieldName…] FROM ‘C:20110413CB1.xlsx’.[DATA$]

    or an asterisk (*) to indicate that all fields need to be extracted, e.g.:

    SELECT * FROM ‘C:20110413CB1.xlsx’.[DATA$]

    Regards
    Kirill

  5. Michael M says:

    I’m trying this solution on two different access DB (one from July and one from August) which have the same table layout but I can’t get it to work.
    I get an SQL errormessage saying “no colums specified”

    Heres my sql string:

    SELECT [tblDataIn August 2011].*
    FROM [tblDataIn August 2011]

    (The above is the sql generated in power pivot)

    Union ALL
    Select *
    From ‘C:UsersxxxDocuments7 July DB 2011.accdb’.[tblDataIn Juli 2011$]

  6. Michael M says:

    Correct path, typo in above message.
    From ‘C:UsersxxxDocuments7 July DB 2011.accdb’.[tblDataIn Juli 2011$]

  7. Thierry says:

    for three or more excel sheets, you can use following formula that seems to be the only one to work:

    SELECT [Sheet1$].*
    FROM [Sheet1$]

    UNION ALL
    SELECT * FROM `C:\$link trials\2008 short example trial.xlsx`.[sheet2$]

    UNION ALL
    SELECT * FROM `C:\$link trials\2008 short example trial.xlsx`.[sheet3$]

    Now, this works with data that can fit on an excel sheet.
    Does anybody know how I can do this with extreme huge data which is to big to be put in an excelsheet of excel 2010?
    In combination of SQL?

    Best regards,

    Thierry

    • Glen says:

      I tried doing this with comma delmited text files but it won’t allow me to access the Switch to drop down list and select query editor.

      Was anyone able to address Thierry’s question relating to do this with huge data files greater than 1.1 million rows (e.g. comma delimited Text Files with >2million rows)?

  8. Kaori K says:

    SELECT [‘Dist1$’].*
    FROM [‘Data$’]
    UNION ALL
    SELECT *FROM `C:\Master\Dist1.xlsx`.[Data$]

    The 4 lines above worked fine. But if I tried to add in another line, it no longer works. Does anyone know how can I make this work?

    UNION ALL
    SELECT *FROM `C:\Master\Dist2.xlsx`.[Data2$]

    When I try to validate, the following error pops up
    “The SQL statement is not valid. There are no columns detected in the statement.’

  9. Tone says:

    SELECT [dbo].[v_carbonreportingv2].* FROM [dbo].[v_carbonreportingv2]

    UNION ALL
    SELECT * FROM `C:\TEST\BEImportfilCarbonReporting_V2.xlsx`.[‘Import to database$’]

    Incorrect syntax near ‘`’.

    I’m trying to combine an excel worksheet with an sql server table but only gets syntax error. When I try to combine to excel worksheet I don’t get the syntax error. Is it not possible to combine excel worksheet and a database table?

    I have the same columns in the worksheet as I have in the table, also same formatting…

  10. Tim Windsir says:

    I was having issues with the name of the file, which started by producing error: “The SQL statement is not valid. There are no columns detected in the statement” for this statement:-

    SELECT [‘Prime ref prompt – Parties (1)$’].* FROM [‘Prime ref prompt – Parties (1)$’]
    UNION ALL
    SELECT * FROM ‘C:\CData\temp file.xlsx’.[data$]

    So I (eventually) deleted the first two lines leaving:-

    SELECT * FROM ‘C:\CData\temp file.xlsx’.[data$]

    then I git the following error: ”C:\CData\temp file.xlsx” is not a valid name. Make sure that it does not include invalid characters or punctuation and that it is not too long.

    What!?!? I thought. Then it occurred to me, It’s a SQL statement, use [] for containing names. This now works :-)

    SELECT [‘Prime ref prompt – Parties (1)$’].* FROM [‘Prime ref prompt – Parties (1)$’]
    UNION ALL
    SELECT * FROM [C:\CData\temp file.xlsx].[data$]

    Also only use [‘ ‘] if the worksheet name contains spaces….. hope this helps someone else!

  11. Todd says:

    Tim – the [] did the trick. Thanks for posting.

  12. Neil says:

    I kept having the same trouble as the others here “The SQL statement is not valid. There are no columns detected in the statement.”

    The issue boiled down to not having the same number of columns in the three datasets I was using.

    Maybe of help to someone out there..

    • Thanks Neil! And sometimes there are columns that look empty, but have something in them that Excel sees as part of the used range.
      Press Ctrl+End to see where the worksheet really ends.

  13. Prashant Kumar Gupta says:

    Hi All,

    I tried the above and it worked with with multiple workbooks. But whenever I add any new column in the source workbook, it does not reflect in the power pivot table . Can anyone tell me why and how to make the changes work in power pivot.

  14. Jeroen says:

    Anyone know how to do this with CSV files that reside in different directories (but are otherwise identical)?
    Thanks!

  15. Jean-Paul says:

    Hi All,
    Thanks Debra, the procedure work fine for me (consolidate identical sheets in Excel 2013)
    When pivot table is created, I am not able to group because the command is grayed ?
    Note : I have done the same procedure with excel 2007 and the group command work fine
    Can anyone tell me why
    Best regards

  16. stephane says:

    SELECT [FACTIT$].* FROM [FACTIT$]
    UNION ALL
    SELECT * FROM `Z:\xxxxxxxxxx\AVANTAGE TO EXCEL\CD.xlsx`.[FACTIT$]
    UNION ALL
    SELECT * FROM `Z:\xxxxxxxxxxx\AVANTAGE TO EXCEL\AVA USA-CA.xlsx`.[FACTIT$]

    This sql statement works for 3 files from same structure excel files. but on another table on those same file

    SELECT [FACTMA$].* FROM [FACTMA$]
    UNION ALL
    SELECT * FROM `Z:\xxxxxxxxxx\AVANTAGE TO EXCEL\CD.xlsx`.[FACTMA$]
    UNION ALL
    SELECT * FROM `Z:\xxxxxxxxxxx\AVANTAGE TO EXCEL\AVA USA-CA.xlsx`.[FACTMA$]

    It gives me the “The SQL statement is not valid. There are no columns detected in the statement.”

    it works on one table but not on the other. ??

  17. Christopher Yap says:

    Hi,

    When I click valid and it was fine but when running the following SQL
    SELECT [‘For Outlook(C1)$’].* FROM [‘For Outlook(C1)$’] where ([Userid] “”)
    UNION ALL
    SELECT [‘For Outlook(RO)$’].* FROM [‘For Outlook(RO)$’]
    UNION ALL
    SELECT * FROM `\\team.eworkplace.tas.com@SSL\DavWWWRoot\subentity\ShSvc\Documents\Estate Management\Database\Occupancy database\Database_DTTA_Occupancy.xlsm`.[‘For Outlook$’]

    I encounter error

    the last one is from another worksheet residing in SharePoint document library

    OLE DB or ODBC error: Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done..
    The current operation was cancelled because another operation in the transaction failed.

  18. Chris says:

    I have the same problem with getting this error message all of the sudden.

    I basically have the same setup from multiple Excel files and the reload has been working for six months but all of the sudden I started getting this error message:

    OLE DB or ODBC error: Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done..
    The current operation was cancelled because another operation in the transaction failed.

    Anyone have a solution?

Leave a Reply to stephane Cancel reply

Your email address will not be published. Required fields are marked *