The USDA Ag census says that there were 12,549 CSAs in 2007, but Local Harvest has only 2,700+ in their database. This is a problem, and the problem is not with Local Harvest, but rather with the USDA methodology. If you go to the USDA website http://www.nal.usda.gov/afsic/pubs/csa/csa.shtml you will see this number and you can look at the report. I recommend Appendix A, which details the methodology used. Here are some of the problems.
1) Mailout and mailback was the primary data collection method.
2) The mailback was followed up by electronic data collection over the Internet and phone followup to non-responders.
3) Paid media was used to reach a narrow target group of difficult-to-reach farms.
4) Missing data was imputed using past survey reports and methods from the 2002 census.
5) Statistical estimation was used to cover non-response when other methods failed.
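To make items 4 and 5 concrete, here is a minimal sketch of what "imputing missing data from past reports" amounts to. The farm names, values, and fallback rule are all invented for illustration; the actual USDA procedures in Appendix A are far more elaborate.

```python
# Hypothetical sketch of imputation from past reports: when a farm does not
# respond, fill in its missing value from its own prior-census report,
# falling back to the mean of the current responders. All data invented.

def impute(current, previous):
    """Return a fully populated dict; None marks a non-response."""
    responders = [v for v in current.values() if v is not None]
    fallback = sum(responders) / len(responders)
    filled = {}
    for farm, value in current.items():
        if value is not None:
            filled[farm] = value            # actual response
        elif farm in previous:
            filled[farm] = previous[farm]   # carried forward from the prior census
        else:
            filled[farm] = fallback         # mean of this year's responders
    return filled

# Invented example: farm B did not respond but reported last time;
# farm D did not respond and is new, so it gets the responder mean.
current_2007 = {"A": 10.0, "B": None, "C": 14.0, "D": None}
previous_2002 = {"A": 9.0, "B": 12.0, "C": 13.0}
print(impute(current_2007, previous_2002))
```

Notice that every imputed number is a guess dressed up as a datum; that is the heart of the complaint that follows.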
There are deeper problems with US Census procedures generally. The Democrats have pushed for years for statistical random sampling, rather than trying to measure the whole population, which the Republicans support. The antipathy is political, of course: a true random sampling would likely redraw Congressional districts to the favor of the Democrats, so of course the Republicans want to keep the present flawed system. It should be noted that one of the best overall population censuses was among the first, commissioned by William the Conqueror in 1085 to get a real count of the wealth of Anglo-Saxon England, now under his control. Even then, the detailed returns preserved in the Little Domesday Book were never condensed into the Great Domesday Book, simply because of the time and manpower detailed data collection required. Wikipedia has a very nice discussion at http://en.wikipedia.org/wiki/Domesday_Book The point here is that even with a subjugated nation and plentiful men-at-arms to make the counts, the original plan had to be modified.
Okay, per the mailout-and-mailback problem: one of the little cliches in anthropology is that when a sociologist wants to survey how much people recycle, he (or she) asks the informant and accepts the answer as true. What the anthropologist does is pick through their garbage to see how much they actually recycle. For those of you who have done data collection in the field, you know the myriad questions you go through and the thousands of little decisions you make, fitting normal human responses into the pigeon-hole categories on your data sheet. There is a very good reason Mark Twain popularized the remark he attributed to Benjamin Disraeli: "There are three kinds of lies: lies, damned lies, and statistics." The errors start right at data collection.
Now, once you have non-responses (and the 2007 survey had an 85.2% response rate, versus 88.0% in 2002 and 86.2% in 1997) you have bias. However, in this survey, "There was no effort to measure nonresponse bias for the census" (page A-11). Yet there was a concerted effort to do Internet and phone followup to the non-responders. So, rather than make any attempt to get a grip on the bias generated by a roughly 15% non-response rate, the USDA simply ignored the problem in favor of harassing people so it could have the appearance of measuring the whole population. [Sidebar: a random sample means that every datum in the population has an equal chance of being in the sample; done properly, this is far superior to trying to measure the whole population. "Population" in this sense is the whole set of items to be counted, not just the human population of the US.]
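The sidebar's point can be shown with a toy simulation. The farm counts, sales figures, and response rates below are all invented; the point is only that when the likelihood of responding is correlated with the thing being measured, the responding subset is biased no matter how many of the same willing responders you chase, while an equal-chance random sample has no such built-in skew.

```python
# Invented illustration of nonresponse bias: if small farms are less likely
# to mail the form back, the responders overstate the population's mean
# sales. All numbers are hypothetical.
import random

random.seed(1)

# Hypothetical population: 800 small farms ($500 sales), 200 large ($50,000).
population = [500] * 800 + [50_000] * 200
true_mean = sum(population) / len(population)

# Suppose small farms respond 70% of the time, large farms 95%.
responses = [v for v in population
             if random.random() < (0.70 if v == 500 else 0.95)]
biased_mean = sum(responses) / len(responses)

# A true random sample: every farm has an equal chance of selection.
sample = random.sample(population, 200)
sample_mean = sum(sample) / len(sample)

print(round(true_mean), round(biased_mean), round(sample_mean))
```

With these made-up rates, the responder mean lands well above the true mean, because the large farms are overrepresented among responders; the random sample carries no such systematic push in either direction.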
So, when there was still a non-response problem, the USDA used paid media and imputed values through past methodology to account for the non-responders. The Appendix's term for this effort is Coverage Adjustment (page A-7). At this point the methodology is adding data that was never collected into the database. This is just nonsense, and it raises the question of how much effort is being expended to get specific results for a specific political purpose. Certainly, these adjustments amplify whatever bias the database already contains.
Finally, statistical estimation was used when all other efforts failed to cover the non-responders. So, after they tweak the database so that it says what they want it to say, they subject their skewed and biased data to some statistical fancy footwork to shore up their results. As an example, look at Table A of Appendix A, under Farms by Value of Sales, under $1,000. The total of 688,833 farms includes a Nonresponse Adjustment of 12.45% and a Coverage Adjustment of 30.03%. In other words, adjustments account for 42.48% of this category! I ask you, does it make sense that you can get a valid result when close to half of the data for a category comes from adjustments? Of course not. You know, these people are getting paid a GOOD SALARY to put out this kind of nonsense. Meanwhile, you and I cannot get paid a fair price for the food we grow.
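The Table A arithmetic quoted above is easy to check for yourself. Using the figures already cited (688,833 farms; 12.45% and 30.03% adjustments), the combined adjustment share and the implied headcount of "adjusted" farms work out as follows:

```python
# Check the Table A figures quoted above: how much of the under-$1,000
# sales category comes from adjustments rather than actual responses.
total = 688_833            # total farms in the category
nonresponse_pct = 12.45    # Nonresponse Adjustment share
coverage_pct = 30.03       # Coverage Adjustment share

adjusted_pct = nonresponse_pct + coverage_pct
adjusted_farms = round(total * adjusted_pct / 100)
print(f"{adjusted_pct:.2f}% adjusted = {adjusted_farms:,} of {total:,} farms")
```

That is nearly 300,000 farms in one category that exist in the totals only by virtue of adjustment procedures, not returned forms.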
As for the Local Harvest database: it consists of motivated CSA farms who want to sell shares. The bias, then, is out in the open, because the database is a self-selected sample. The listing is free, and there is no long-standing reluctance to report (i.e., no "I'm from the government and I'm here to help you." Run, Forrest, run!). There is also no harassment if people don't list information, unlike the USDA census. The Local Harvest database has a limited use, but we know the limitations. The USDA, on the other hand, is intentionally pulling a fast one on the public.