PowerPivot from Identical Excel Files

You can use the PowerPivot add-in for Excel 2010 to create a report from multiple Excel workbooks or worksheets by joining the tables on a primary key and a foreign key, such as 'ProductID' in a Sales table and a Pricing table.


In this example though, we want to combine the data in two Excel files that have an identical structure -- sales data for the East and West regions. In this case, we can't use a key to connect the tables; instead, we want to create one combined table from all the data. The following technique allows you to import more than a million records from Excel, despite the fact that one worksheet can only contain up to 1,048,576 rows. At least that's possible in theory -- on my computer it imported about 1.2 million, then gave up, after whining about memory resources.
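
As a rough illustration of what "one combined table" means, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for PowerPivot. The table layout (Product, Units) and the sample rows are made up for the example; they are not the columns in the sample files. It stacks two identically structured tables with UNION ALL, and also runs a plain UNION for comparison.

```python
import sqlite3

# In-memory database standing in for the two workbooks (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EastSales (Product TEXT, Units INTEGER);
    CREATE TABLE WestSales (Product TEXT, Units INTEGER);
    INSERT INTO EastSales VALUES ('Bars', 10), ('Cookies', 5);
    INSERT INTO WestSales VALUES ('Bars', 10), ('Crackers', 7);
""")

# UNION ALL appends every row from the second table to the first.
all_rows = conn.execute(
    "SELECT * FROM EastSales UNION ALL SELECT * FROM WestSales"
).fetchall()

# UNION (without ALL) removes duplicate rows, so the shared
# ('Bars', 10) record is kept only once.
distinct_rows = conn.execute(
    "SELECT * FROM EastSales UNION SELECT * FROM WestSales"
).fetchall()

print(len(all_rows), len(distinct_rows))
```

The first query returns all four rows, while the second returns only three, because the identical record from both regions is collapsed into one. For sales data that is usually not what you want, which is why the ALL keyword matters when combining the files.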


Thanks to Excel MVP Kirill Lapin for sharing this very helpful tip with us. You can see more of Kirill's work in last week's posts on Combining Data from Two Excel Files in a Pivot Table.


Create a Connection in the Workbook


The key to this technique is to start by creating a workbook connection, before you launch PowerPivot.



  1. On the Excel Ribbon's Data tab, click Connections.
  2. In the Workbook Connections window, click Add.
  3. At the bottom of the Existing Connections window, click Browse for More.
  4. Navigate to the folder where your files are located.
  5. Select one of the files that you want to import -- EastSales.xlsx in this example -- and click Open.
  6. Select a table to import, and click OK.
  7. The new connection appears in the Workbook Connections window.



Combine the Data in PowerPivot



  1. Close the Workbook Connections window, and on the Ribbon, click the PowerPivot tab.
  2. Click PowerPivot Window, to launch the PowerPivot add-in.



Note: If you're using Windows XP, the PowerPivot window has a menu bar. If you're using Vista or Windows 7, you'll see a Ribbon instead.



  1. On the Table menu, click Existing Connections, or, on the Ribbon, click Design, then Existing Connections.
  2. At the bottom of the Existing Connections window, under Workbook Connections, click on the connection that you added, and click Open.
  3. In the Table Import Wizard, click Next, then select the table, and click Finish.
  4. After the data is successfully imported, click Close.



Change the SQL Statement


Now that the first table has been imported, you can change its properties, to combine it with data from the second table.



  1. On the Table menu, click Table Properties, or on the Ribbon, click the Design tab, then click Table Properties.
  2. At the right, from the Switch To drop-down list, select Query Editor.
  3. Edit the SQL statement, to create a union query, combining the two tables. In this example, the SQL statement is:

SELECT [EastSales$].* FROM [EastSales$] UNION ALL SELECT * FROM 'C:\_TEST\WestSales.xlsx'.[WestSales$]


After you change the SQL statement, click the Validate button, to verify that the statement is correct, then click Save.
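
If you have more than two files, typing the statement by hand gets tedious. Here's a minimal sketch of a Python helper that builds a statement in the same shape as the one above. The function name, the C:\Data folder, and the NorthSales file are hypothetical. The character placed around each file path is left as a parameter, because the statement above shows a single quote, while a reader comment below reports that backticks worked.

```python
def build_union_query(local_sheet, external_sheets, quote="'"):
    """Build a UNION ALL statement for one connected sheet plus extra files.

    local_sheet     -- sheet name in the connected workbook, e.g. "EastSales"
    external_sheets -- list of (workbook_path, sheet_name) pairs to append
    quote           -- delimiter placed around each external workbook path
                       (an assumption; try "`" if "'" fails validation)
    """
    # First SELECT reads the sheet that the workbook connection points to.
    parts = ["SELECT [{0}$].* FROM [{0}$]".format(local_sheet)]
    # Each additional file gets its own SELECT, qualified by its full path.
    for path, sheet in external_sheets:
        parts.append(
            "SELECT * FROM {q}{p}{q}.[{s}$]".format(q=quote, p=path, s=sheet)
        )
    return "\nUNION ALL\n".join(parts)


# Hypothetical paths and sheet names, for illustration only.
query = build_union_query(
    "EastSales",
    [
        (r"C:\Data\WestSales.xlsx", "WestSales"),
        (r"C:\Data\NorthSales.xlsx", "NorthSales"),
    ],
)
print(query)
```

Paste the printed text into the Query Editor, then click Validate to check it before you save.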




Note: The SQL query string can also be edited in the Excel workbook connection window, by selecting the connection, and clicking Properties. However, there's no Validate feature there.


Create the Pivot Table


Next, you can create a pivot table from the combined data.



  1. On the Toolbar, click the Create a PivotTable button, or on the Ribbon, click the Home tab, then click PivotTable.
  2. Select a location for the pivot table, and click OK.
  3. Add fields to the pivot table layout, to see a summary of the data.

Here's the pivot table that was created from the combined data, with columns for the East and West regions. The Report Layout is Tabular, and Number format is used, with thousands separator and zero decimals.
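
To make clear what that summary computes, here's a minimal sketch in plain Python, with made-up rows. The column names Region, Product and Units are assumptions, not the fields in the sample files. It totals Units for each Product, with one column per Region, and prints the totals with a thousands separator and zero decimals.

```python
from collections import defaultdict

# Hypothetical combined rows: (Region, Product, Units).
rows = [
    ("East", "Bars", 1200),
    ("East", "Cookies", 800),
    ("West", "Bars", 2000),
    ("West", "Bars", 500),
]

# pivot[product][region] holds the summed units; missing cells default to 0.
pivot = defaultdict(lambda: defaultdict(int))
for region, product, units in rows:
    pivot[product][region] += units

# Print one row per product, one column per region.
print("{:<10}{:>8}{:>8}".format("Product", "East", "West"))
for product in sorted(pivot):
    east = pivot[product]["East"]
    west = pivot[product]["West"]
    print("{:<10}{:>8,}{:>8,}".format(product, east, west))
```

A product with no sales in a region shows 0 in this sketch; in the pivot table, that cell would normally be left blank.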




Detailed Instructions and Sample Files


To see detailed instructions for this technique, with more screen shots, visit the PowerPivot from Identical Structure Excel Files page on the Contextures website. That page also has a link for downloading the East and West sales data that I used in this example.


Watch the PowerPivot Video


To see the steps for combining data from multiple tables in PowerPivot, please watch this PowerPivot from Identical Excel Files video tutorial.



Download the PowerPivot Add-In


You can download the free PowerPivot add-in from the Microsoft website: PowerPivot Download



8 comments to PowerPivot from Identical Excel Files

  • Kirill Lapin (KL)

    Now, here comes some interesting stuff :) I've done some quick'n'dirty testing.
    For exactly the same data (1.048.469 records) from different sources the data import went as follows:

    Source                      Max Memory Used   Time to Load
    TXT                         157 MB            0:00:30
    OLEDB connection to XLSX    475 MB            0:01:28
    XLSX                        830 MB            0:02:03

    This may explain why Debra ran out of memory when she attempted to load more than 1.200.000 records :)

  • Kirill Lapin (KL)

    You need to add here the ~140 MB of memory that XL takes up on open, plus ~40 MB for the PowerPivot window :)

  • Ivan

    Hi,

    The site on PowerPivot from Identical Excel Files shows the steps to combine TWO similar files.

    Q: What are the steps for combining THREE OR MORE similar files? I got stuck at the section on Change the SQL Statement.

    The below SQL Statement does not work for me in my Excel 2010.

    SELECT [DATA$].* FROM 'C:20110413CB1.xlsx'.[DATA$]
    UNION ALL
    SELECT [DATA$].* FROM 'C:20110413CB2.xlsx'.[DATA$]
    UNION ALL
    SELECT [DATA$].* FROM 'C:20110413CB3.xlsx'.[DATA$]

    Please help me. Thank you.

    Regards,
    Ivan

  • Kirill Lapin (KL)

    Hi Ivan,

    The error is in the SELECT clause. Between SELECT and FROM you must either list the field names to extract, e.g.:

    SELECT [FieldName1],[FieldName2],[FieldName...] FROM 'C:20110413CB1.xlsx'.[DATA$]

    or an asterisk (*) to indicate that all fields need to be extracted, e.g.:

    SELECT * FROM 'C:20110413CB1.xlsx'.[DATA$]

    Regards
    Kirill

  • I'm trying this solution on two different Access databases (one from July and one from August) which have the same table layout, but I can't get it to work.
    I get an SQL error message saying "no columns specified".

    Here's my SQL string:

    SELECT [tblDataIn August 2011].*
    FROM [tblDataIn August 2011]

    (The above is the sql generated in power pivot)

    Union ALL
    Select *
    From 'C:UsersxxxDocuments7 July DB 2011.accdb'.[tblDataIn Juli 2011$]

  • Correct path, typo in above message.
    From 'C:UsersxxxDocuments7 July DB 2011.accdb'.[tblDataIn Juli 2011$]

  • For three or more Excel sheets, you can use the following formula, which seems to be the only one that works:

    SELECT [Sheet1$].*
    FROM [Sheet1$]

    UNION ALL
    SELECT * FROM `C:\$link trials\2008 short example trial.xlsx`.[sheet2$]

    UNION ALL
    SELECT * FROM `C:\$link trials\2008 short example trial.xlsx`.[sheet3$]

    Now, this works with data that can fit on an Excel sheet.
    Does anybody know how I can do this with extremely large data that is too big to fit on an Excel 2010 worksheet?
    In combination with SQL?

    Best regards,

    Thierry
