NiallBeag wrote:
Turning lead into gold was once considered "science"...
Frequency analyses cannot produce statistically trustworthy results without having a heck of a lot of language data. Even now, corpuses can be spoiled by selective choosing of source material, and they contain billions of words. No matter anyone's best intentions, in 1966 it would have been impossible to build a large-scale, statistically relevant corpus, so they would have had to be very selective in what they thought was worth including, and that selective step destroys the objectivity of the study.
Look, I'm off this topic as of this post. I mean seriously. The General Service List of English was compiled in the 1950s, and even with the issues it has, it is to this day considered "excellent" despite its age; see
Corpus Linguistics: An Introduction. It has since been updated with more modern data, but that doesn't completely invalidate the original. The French carried out a comparable count in the 1950s as well, which [url="http://expsy.ugent.be/subtlexus/Brysbaert%26NewBehaviorResearchMethods.pdf"]is still highly regarded[/url].
While modern methods of collecting data have made it easier to gather massive samples, no one in the field, in any of the literature I have read, has ever compared the work of frequency analysts from past decades to medieval superstition.
The Routledge Handbook of Corpus Linguistics wrote:
Language teachers have long used lists of important vocabulary as a guide to course design and materials preparation, and corpus data have always played a major part in developing these lists. As early as the 1890s, Kaeding supervised a manual frequency count of an eleven-million-word corpus to identify important words for the training of stenographers, and similar counts were used by language teachers from at least the early twentieth century onwards (Howatt and Widdowson 2004: 288–92). The argument for prioritising vocabulary learning on the basis of frequency information is based on the principle that the more frequent a word is, the more important it is to learn. Proponents of a frequency-based approach point to the fact that a relatively small number of very common items accounts for the large majority of language we typically encounter. Nation (2001: 111), for example, reports that the 2,000 word families of West's (1953) General Service List account for around 80 per cent of naturally occurring text in general English. This suggests that a focus on high-frequency items will pay substantial dividends for novice learners since knowing these words will enable them to understand much of what they encounter (Nation and Waring 1997: 9).
The development of computerised corpus analysis has made the job of compiling word-frequency statistics far easier than it once was, and has given impetus to a new wave of pedagogically oriented research (e.g. Nation and Waring 1997; Biber et al. 1999; Coxhead 2000). Importantly, the widespread availability of corpora and the ease of carrying out automated word counts seems also to offer individual teachers the possibility of creating vocabulary lists tailored to their learners' own needs. However, it is important to bear in mind that corpus software is not yet able to construct pedagogically useful word lists without substantial human guidance.
See that last sentence.
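Since we're on automated word counts: here is a minimal sketch, in Python, of the kind of raw frequency count the Handbook describes and of the coverage arithmetic behind Nation's 80 per cent figure. The toy text and the top-N cutoff are my own placeholders, not anything from West or the Handbook; the point is just to show why the raw output needs the human guidance that last sentence mentions.
[code]
from collections import Counter
import re

# Stand-in corpus; a real study would use millions of words.
text = """The cat sat on the mat. The dog ran to the cat.
Paris is lovely. Running, runs, and ran are all forms of run."""

# Naive tokenisation: lowercased alphabetic strings only.
tokens = re.findall(r"[a-z]+", text.lower())
counts = Counter(tokens)

# Coverage of the N most frequent types, mirroring the claim that
# a small number of very common items covers most running text.
N = 5
top = counts.most_common(N)
coverage = sum(c for _, c in top) / len(tokens)
print(top)
print(f"Top {N} types cover {coverage:.0%} of all tokens")

# What the raw count gets wrong without a human in the loop:
# "running", "runs", and "ran" are counted as three separate items
# rather than one word family, and the proper noun "Paris" is
# indistinguishable from ordinary vocabulary. Grouping items into
# pedagogically useful families is exactly the manual work that
# went into lists like West's.
[/code]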
I am not saying that either of these works, BG or BC, is without issues or could not be improved upon today, but come on... Outright dismissal as "amateur" and "opinion", and comparing scholarship from the 60s to alchemy simply on the basis that it's from the 60s? Attack the data. Attack the sample size as a fact: not how large you think it was, but the actual number. Attack the collection methods described in BG, the actual collection methods. Attack the analytical methods used, and explain why you think the margin of error accepted by the author of BG is too high.
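And if someone does want to attack the margin of error, this is the arithmetic the argument has to engage with. A rough back-of-the-envelope sketch (my own illustration, not anything from BG) of the uncertainty on a single word's frequency estimate. It treats the corpus as independent draws, which real text is not, so take it as a lower bound on the noise.
[code]
import math

def freq_margin(occurrences: int, corpus_size: int, z: float = 1.96):
    """Relative frequency of a word and an approximate 95% margin
    of error, using the binomial standard error. This treats each
    token as an independent Bernoulli trial, a simplification."""
    p = occurrences / corpus_size
    se = math.sqrt(p * (1 - p) / corpus_size)
    return p, z * se

# A common word seen 5,000 times in a 1,000,000-token corpus:
p, moe = freq_margin(5_000, 1_000_000)
print(f"frequency {p:.3%} +/- {moe:.3%}")
# Roughly 0.500% +/- 0.014%: even a 1960s-scale corpus pins down
# the high-frequency words quite tightly. It is the rare words,
# and the choice of source texts, where the real vulnerabilities
# lie, which is why the sample and the collection methods are the
# things worth attacking.
[/code]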