JKP Application Development Services.

                    Microsoft Office Application Development

Worksheet Data

Up • Worksheet Data • Form Controls

•  •

Search Zoeken

WWW
This site

Go to Ga Naar
Home
Up

Donate Donaties
Did you find something helpful on my site? Consider a donation!
Heeft u iets gevonden waar u wat aan had? Overweeg dan een donatie!

 

The New Excel 2007 File Format

Working With Worksheet Data In An Excel 2007 File

In this chapter I'll explain how you can read and write to a cell on a worksheet by editing the xml inside an Excel file's internals directly.

Finding the proper worksheet

First a tip about how to get at the content. An Excel 2007 file is nothing more than a zip container which holds a couple of folders and files. You can view those simply by adding .zip behind the filename. Then you right-click the file and select Explore.

The XML package in an Excel 2007 file is structured like this:


Contents of an Excel xlsm file.

In this structure, we need to dive into the “xl” folder, which (in the sample file I created) contains:


Contents of "xl" folder

The file Workbook.xml contains this section which is relevant to this chapter:


Part of the XML in "Workbook.xml"

Let's assume we want to add something to the worksheet “Comments”. We must find which xml file contains its data. Key element here is the r:id part in the xml shown above, the r:id of worksheet "Comments" equals "rId7".

In the folder xl\_rels there is a file called “Workbook.xml.rels”, which contains this snippet of xml:


Section in "Workbook.xml.rels" which shows us which file belongs to our worksheet

As can be seen, the worksheet "Comments" (rId7) is represented by a file named "sheet7.xml", which is one of the files listed in the folder "worksheets":


folder xl\worksheets showing its contents

Reading a cell's value

The worksheet “Comments” contains this:


Content of worksheet "Comments" (cell B2 contains a cell comment)

Which in turn is reflected in sheet7.xml as:


XML in sheet7.xml (irrelevant part collapsed) showing cells A1, A2 and A1

Suppose we’re interested in the content of cell A1. How do we find that content?

Because it says: <c r="A1" t="s">

We know that cell A1 contains a string (t="s"). The next line: <v>28</v> tells us, we need to find item number 28 inside the file called “xl/SharedStrings.xml”

The top of that file looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="41" uniqueCount="29">

So there are 29 unique strings in this workbook. Since the index starts at zero and the sheet7.xml says we need index # 28, we need the last "<si>" entry in that XML file:

Not very hard!

Adding Text to a cell

Now suppose we want to add a new text entry to a row in sheet “Comments”?

Let’s also assume the text is not yet included in the SharedStrings.xml file.

First we modify the string count at the top of that file. Change:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="41" uniqueCount="29">

to:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="42" uniqueCount="30">

Now we can add our new string to this file. At the very bottom of the file insert:


Addition of a new string in the sharedStrings.xml file

Done with sharedStrings.xml. Now to the changes in sheet7.xml.

Here is the relevant part of the sheet7.xml file again:

<row r="1" spans="1:2">
  <c r="A1" s="33" t="s">
    <v>28</v>
  </c>
</row>
<row r="2" spans="1:2">
  <c r="A2">
    <v>12</v>
  </c>
</row>
<row r="3" spans="1:2">
  <c r="A3" s="34">
    <v>39218</v>
  </c>
</row>

To add a row with our new string, we add the bold part:

<row r="1" spans="1:2">
  <c r="A1" s="33" t="s">
    <v>28</v>
  </c>
</row>
<row r="2" spans="1:2">
  <c r="A2">
    <v>12</v>
  </c>
</row>
<row r="3" spans="1:2">
  <c r="A3" s="34">
    <v>39218</v>
  </c>
</row>
<row r="4" spans="1:2">
  <c r="A4" s="33" t="s">
    <v>29</v>
  </c>
</row>

Done!! Here is what the sheet looks like when opened in Excel:

The result of our xml editing displayed in Excel

Adding numbers to a cell

Adding a number to a cell works similar to adding text, but with two differences: The numbers are kept in the Sheet?.xml itself (no reference to sharedStrings.xml) and you can add a number format.

A cell with number format “general” and no special formatting applied is represented by this xml in sheet7.xml:

<row r="5" spans="1:2">
  <c r="A5">
    <v>12</v>
  </c>
</row>

Cell formats (styles) are referred to by an index number, like this (showing the date in cell A3 now):

<row r="3" spans="1:2">
  <c r="A3" s="34">
    <v>39218</v>
  </c>
</row>

This happens to be a date style.

The 34 index points to another file within the package: Styles.xml This file starts out with a list of custom number formats. Then further down, a cell formats cross reference is shown, called cellXfs, which -if applicable- refers back to the custom number format list. The s=”34” within Sheet?.xml refers to one of the elements within this node:

Another zero based list. So number 34 is the 35th item (and last) of that list:

This cell format cross reference in turn references the numFormatId’s list, the FontId’s list, the fillId’s list and the borderId’s list, which are all contained within Styles.xml.

As you can see, adding formatted cells to a worksheet isn’t that straightforward. You will have to check whether your specific cell formatting combination is already in the cell formatting cross reference (in Styles.xml), which involves checking all individual formatting references for their respective elements.

If your format combination is new, you’ll have to extend any formatting elements’ list with the new format and update the style crossreference. Of course all “count” arguments of these lists need to be updated as well.

Download

I have made the file used in this article available for download:

Excel2007FileFormatDemo.zip

Feedback

Since you have managed to reach the end of this article, maybe you would care to write me a small message, expressing your opinion on this tool?

Click here to write an email message to me.

You may also rate this article:

(Rated: 308 times, average rating: 5.9)

Comments

All comments about this page:


Comment by: XL-Dennis (5/18/2007 2:38:15 AM)

J K,

Check out the nifty tool XML Notepad from MSFT: http://www.microsoft.com/downloads/details.aspx?familyid=72d6aa49-787d-4118-ba5f-4f30fe913628&displaylang=en

Kind regards,

Dennis


Comment by: Ken Puls (5/22/2007 8:38:20 AM)

Very nice disection, Jan Karel! Quite interesting. :)


Comment by: Jan Karel Pieterse (5/22/2007 10:08:21 AM)

Hi Ken,

Thanks!


Comment by: Mangesh (8/3/2007 4:51:54 AM)

Hi, The page is quite helpful, I would like to remove Home tab from Excel 2007, How to do that, if possible please reply.

regards

-Mangesh


Comment by: Jan Karel Pieterse (8/3/2007 5:10:02 AM)

Hi Mangesh,

See this page for information about modifying the ribbon:

pschmid.net


Comment by: Frank Diaz (9/19/2007 10:35:06 AM)

I realize I am In the wrong location but I can not find the right one to list my problem. The is my problem. When I open excel I get a message that say's Compile error in hidden module: AutoExeNew. Also when I call I get a message that say's compile error in hidden module:DistMon. I am yusng Eaxcel 7000. My old excel worksheets using Excel 2000 have been affected when converted to Excel 20007. I have no protected files.

Can you help.


Add a comment too!!!

Please enter your name (required, will be shown):

Your E-Mail address (optional; will not be shown, nor be used to send you unsolicited information):

Your comments on this page (will be shown):

    Subscribe in a readerpowered by longhead.com

Use the contact page to issue questions or comments about this website.
Copyright © 2003-2008 JKP Application Development Services