There are 6 mailings (1, 2, 3, 4, 5, 6). What I mean by a mailing is that each consists of X number of names from multiple mailing lists (6 mailing lists in total to represent each specific mailing). They can borrow names from each other. These names on the lists will be merged and purged against each other (some names can be duplicates) in a priority order where the specific mailing it is has the highest priority. So if it is a "1" mailing it might have "2", "3", "4" parts in it as well, but any names that are in "1" and "2" or "1" and "3" or "1" and "4" will be mailed as a "1" Title and those names will not be included for the other Titles.
The names will be selected by a person based on good performance. So "1" might have a selected 10,000 names, "2" might have a selected 5,000 names, "3" might have a selected 3,000 names, and "4" might have a selected 1,000 names. However, the number of names mailed will not be the sum of those numbers because some of the names will be duplicates. So after the merging and purging of the names "1" will survive at a high rate 99.9% or so while "2", "3", and "4" survive less.
The point of all that was just to get you to understand the set up. The question I want to ask is coming up.
These mailings get mailed and the people buy things from them. Each day a certain number of names are put into their own list, the buyers list. This buyers list is broken into 3 categories; they can be called A, B and C. The buyers list is always included in a mailing so that the people that just bought will be mailed again.
The number of names on the buyers list needs to be predicted about 7-14 days prior to actually mailing so that a number can be estimated for the total names in the entire mailing. As the day to finalize the mailing approaches the closer this prediction is the better.
The buyer's list heading on the mailing will have a date range, for example 06/01/2010-06/18/2010. This means that the heading on the mailing represents all the people that bought from each of the 6 mailings defined above in that span. The number of names before any kind of merging or purging has to be predicted.
As before this buyer's list heading is selected by performance. You can select from each of the categories (A, B, C) and you can select within each of those categories dollar ranges ($50+, $40+, $30+, $20+, $10+). The dollar ranges mean those people selected in that square bought a $50+ item, $40+ item, etc.
The data given is:
1.) About 5 months worth of actual daily numbers of how many buyers per day in each category.
2.) Actual selections by $ amount for 5 months worth of date ranges.
The things that have to be predicted are:
1.) How to predict how many buyers will be added to the list per day.
2.) How many people the selection made will be.
#1 is predicted already, so you can assume you have access to that predicted number already (but if you have a really good way to predict that then please say).
#2 is the main prediction we want to do. The way I do it relies on the accuracy of #1 though.
So here is how I did it:
I took the 5 month history of daily buyers which was broken down like this for each mailing:
Mailing 1
Category A Category B Category C Total
Day1 some # some # some # some #
Day2 some # some # some # some #
Day3 some # some # some # some #
Day4 some # some # some # some #
Day5 some # some # some # some #
Day6 some # some # some # some #
DayN some # some # some # some #and figured out how much of a % each mailing was from the total for each day.So the table would look like this:
Category A Category B Category C Total
Day1 some % some % some % some %
Day2 some % some % some % some %
Day3 some % some % some % some %
Day4 some % some % some % some %
Day5 some % some % some % some %
Day6 some % some % some % some %I did that by dividing the Category A, B, or C number by the Total for that day. The Total % was obtained by adding all the Totals for the day from each of the mailings (all 6) and dividing each mailing's Total # by that combined total of all the mailings.Next I went through the last 5 months of old mailings and looked at their buyer's list span. I went back to the actual totals (1st table above) and calculated the total from each mailing for each span. So what I ended up with was:
In the February mailing, for example, the span might be 02/01/2010-02/20/2010. I added all the "1" buyers, "2" buyers, "3" buyers, "4" buyers, "5" buyers, "6" buyers. and ended up with:
1= some #
2= some #
3= some #
4= some #
5= some #
5= some #
I did this for 5 months worth of mailings. That is 1 mailing per month of each type. So 5 "1"s, 5 "2"s, 5 "3"s, 5 "4"s, 5 "5"s, 5 "6"s.
Next, I got the total of each of the selections for each category and $ range. The table would look like this:
mailing1 mailing2 mailing3 mailing4 mailing5 mailing6 Category A $50+ $40+ $30+ $20+ $10+ Category B $50+ $40+ $30+ $20+ $10+ Category C $50+ $40+ $30+ $20+ $10+Once I populated all the tables for each month (30 tables in all) I divided each of the numbers in each of the tables by their respective Total mailing # (the one in the February example). This gave the % that each Category and $ combo is from the total selection for the buyer list time span per month.
Finally, I averaged the Total %'s from the second table above (5 months worth) so I could use this % later in estimating. This would represent the % that each mailing # is of the total time span from the buyer's list.
I also averaged each of the 30 table's corresponding mailing columns. So for instance in mailing X I had 5 tables (one for each month). Those 5 tables each had 6 "mailing #" headings which meant that names from each of those mailing's lists were used in that month's mailing X. So for mailing X I averaged 5 months worth of mailing 1 category A $50+, mailing 1 category A $40+, etc. I did this until I populated a percentage table of averaged percents. I would later use this table to estimate the number of names within some specific mailing/category/$ combo.
Once I had these averages I looked on the predicted report of buyer's per each day from the the #1 predictions that need to be made heading above. I added these names together to get some total # for some specific span of time.
I multiplied that number by the first average % I described 3 paragraphs ago. This gave me a breakdown of how the divisions were divided up in just one mailing X:
mailing 1: some # obtained by the average of all days mailing 1 total %'s in table 2
mailing 2: some # obtained by the average of all days mailing 2 total %'s in table 2
mailing 3: some # obtained by the average of all days mailing 3 total %'s in table 2
mailing 4: some # obtained by the average of all days mailing 4 total %'s in table 2
mailing 5: some # obtained by the average of all days mailing 5 total %'s in table 2
mailing 6: some # obtained by the average of all days mailing 6 total %'s in table 2
The next step was to further divide these mailings estimated numbers by category and $ amount.
Depending on the selection (all dollar ranges or categories don't have to be actually selected) the sum will change.
So by saying I take the selection of mailing 1, category A, $50+ dollar amount the cell for that would be populated with the result of:
"mailing 1: some # obtained by the average of all days mailing 1 total %'s in table 2" * the averaged percent
This gets the total of people in the mailing 1 selection that are in category A and bought a $50+ item.
If this was at all possible to follow the things I'm wondering are:
1.) What would be the best way to estimate how many buyer's per day? As of now I'm using an already estimated report that I'm not sure who creates. It is off by a few thousand per day usually.
2.) Not related to actually estimating the number of buyer's per day, but instead estimating how much our selection will be (mailing #/category/$ amount), given some already obtained buyer per day numbers, is averaging the %'s over a 5 month span an okay way to go about that? Or is there a better way.


Sign In
Create Account


Back to top









