Managing Pricing and Inventory (Sort of) Automatically for LocationStore with XML and Excel

While working at online GPS retailer LocationStore (part of Dearman Ventures), I was tasked with updating the pricing on products. This would require reviewing the prices from various wholesalers to ensure we were offering the best prices. This originally was a manual task that was very time consuming.

To resolve this issue, a combination of web services, CSV-to-XML processing via PHP, and Microsoft Excel were used, resulting in an automatically updating spreadsheet. This spreadsheet would then be used to update the pricing of products on the LocationStore web site.

Getting from Point A to Point B was no easy task.

The process of importing data feeds from sources on the web into Microsoft Office Excel can vary depending on your source material. Microsoft Office can natively import XML data feeds, but only if they are formatted properly. In our examples below, we will look at importing XML directly from a source, importing a Comma Separated (CSV) source, as well as importing XML that needs to be formatted properly.

Source Data

The wholesaler data we have to analyze was offered in XML format or as a CSV file, depending on the wholesaler. The most flexible method of working with external data from web services is using XML. When Excel imports XML Data, it creates an XML Table in a worksheet. This gives us a clean and easy method for sorting data live from the web. Any time we wanted the most up-to-date information, we’d just press the Refresh button in the Data tab of the ribbon.

As for the CSV files, Excel allows us to import Comma Separated Values much in the same way that we can with a XML Source. The downside of importing a CSV source directly is that you have less control with the data as compared to XML. What we had to do was find a way in which we could download a CSV file and convert it into a properly formatted XML file for use with Microsoft Office Excel.

Converting CSV to XML

In order for us to be able to create the necessary XML Data that Excel needs, we had to do a little scripting. As our data already resided on a web server, the easiest choice for us was to use PHP. A PHP script was created that took a downloaded Comma Separated (or any kind of Text-Delimited) file, read each line of the CSV, and then converted it into a XML file.

The original script we were using was very primitive, requiring hard-coded columns, input filenames, and output filenames. Although this was the case, it did the job perfectly. In the right hands, this script could most likely be optimized for use as a universal converter.

Although importing a XML File into Excel is rather simple, we learned that Excel doesn’t quite manipulate the imported data in an expected fashion.

Improperly Formatted XML Source

Importing XML data into Microsoft Office Excel is a fairly easy task. The problem is that Excel will attempt to guess what each column’s data type is suppose to be. This can cause errors, especially when attempting to calculate dates and numbers. In order for Excel to know the column value type, a XSD (XML Schema Definition) file must be created. A XSD file defines each column in your XML file.

The XSD file can be used to define each column/XML element as a string, integer, decimal, date, and more. When Excel sees that a XML file has a XSD file associated with it (which is defined in the previous PHP files), it will know exactly how to handle each column when the data is imported. For each CSV file converted, a separate XSD has to be created as each converted CSV has a different set of columns/XML elements.

Although the XSD is able the define the data being imported into Excel, the data in the XML file must be formatted correctly as well. In our PHP script from earlier, different built-in and self-created functions had to be used. First, for any field that is going to have HTML tags in it, we have to ensure that line is encoded using the htmlspecialchars() PHP function. For those fields that either have a date or number in them, two functions were specially created: date2xsd and txt2numdate2xsd would take any date field and convert the the correct XSD syntax (Y-m-d). txt2num takes integer and number fields and removes commas (,) from them. For all other fields, the trim PHP function is used to remove any leading or trailing whitespace.

Automating the Conversion Process

Reading through this, you can see that trying to manually perform these tasks can be rather time consuming. The above processes can be automated to allow less interaction from the user. In Windows, this can be accomplished using WGET and the Task Scheduler. In Linux, WGET and Cron Jobs can be used. As our server was Windows based, we will continue on that path.

The first job was to create Windows Batch Files (.BAT) for the tasks we wanted to automate. For our workflow, we created three different batch files: a daily job, a hourly job, and a job to run every 3 hours. Each job was assigned to a different data source, depending on how often that feed updated throughout the day.

Each batch file mostly ran the same procedures. For CSV-based sources, the batch file would delete any previously downloaded CSV files to ensure we’re starting fresh. After doing that, the WGET was used to download the most recently available data from the source. Once this was complete, the PHP Command Line Interface was used to process the downloaded CSV and convert it into XML, as well as assigning it to the appropriate XSD file. This new XML file is given a temporary name to prevent losing any previously created XML Files. If there are no errors, the previously created XML file is deleted and replaced with the one we just made. The final step now deletes the downloaded CSV file from earlier.

Reviewing Product Information

The data from our wholesalers are now in a format that would be of most use to us. The next task was using that data to figure out who could offer us the best deal. To accomplish this task, we’ll need two workbooks: one for comparing data and another for reviewing the data. (The use of two workbooks will make sense later.)

Before we could even compare prices, we had to see what products were available to us and what we wanted to sell of those available products. This we called our Master Product List. The Master Product list was a manual workbook that included information not just from our wholesalers and the manufacturer, but included custom information about the products. This information would be used to create the product titles, descriptions, specs, and more (including reviews). All of this was separate from the product pricing.

This list also played a special part in helping us determine if there would be any possible inventory issues, such as the availability of a product. When a product was marked as Discontinued by the manufacturer but stock was still available, the row containing that product would turn orange indicating a Closeout. This continued across all rows in which the color would change depending on that product’s status.

Once all this information was updated, we could update the store with our product selection. But that was just another piece to this giant jigsaw puzzle.

Price-Check on Column F

We now knew what products we would like to make available at LocationStore. But we still had to figure out how much we were going to sell these products for. This is where our Master Inventory List would come into play.

The first step is for the Excel workbook to examine all the data we pulled in from the various wholesalers. This involved pulling in all the special XML files we created earlier. Then, using the Manufacturer Part Number as the identifier across all wholesalers, Excel formulas would calculate the wholesale price of a product against a special formula used to determine the best price we should sell the product for. If the product was in stock at that wholesaler and it turned out to be the best price, our product sheet would update with that wholesaler’s pricing and inventory information. Of course, a price override feature was included as well if we felt like selling the product a price we wanted.

After reviewing the data to make sure there weren’t anomalies, the updating pricing was ready to roll out to the store.

Updating the Store

With the new inventory and pricing updated accordingly, it’s time to get this data into our store. This part, unfortunately, wasn’t as automatic as was getting the data from our wholesalers.

First, we had to upload the information we gathered from our Master Product List. This would ensure any old products were removed as well as adding any new products. New products would default to the MSRP of that product. Once this step was done, we would update the Master Inventory List workbook to ensure it had our new products. It would then be able to include those in the price comparison. After a spot-check of the information, it was then exported to be uploaded to the e-commerce suite we used.

 

As long and drawn out as this explanation may be, in reality, the updating only takes five to ten minutes at most. Some days would take longer due to the removal or addition of products. For the most part, updating the site was relatively fast. It was definitely a lot faster if we did not have these scripts and Excel doing a lot of the work for us.