Folsom points are the gold standard for finely crafted lithic projectile points, and modern knappers find them very difficult to replicate. As a result, archaeologists, collectors, and knappers alike have suggested over time that they were the product of a few craftsmen: people who made all the points for the band, or possibly for several bands. I was a strong proponent of this craftsman theory in the early years of my archaeological studies. When I attended graduate school in the 1980s I wrote about the craftsman, and I continued to believe in the concept into the late 1990s. See

When I think back over these years, I realize I never had any hard facts or evidence for either position. I only had opinions. Today, as I write this, I have some additional information from which I can form a better-informed opinion. The purpose of this webpage is to share it with the reader. This new insight is in the form of probabilities and statistics, so a reader who is averse to these things may not want to continue. On the other hand, I will point out that probabilities and statistics are the reasons one ultimately loses money in Las Vegas. Also, if the reader is not familiar with the process of creating a Folsom point, I suggest they read

## Background (research prior to 1/1/2001)

In the 1990s and early 2000s I was using finite element analysis (mechanical engineering software) in an attempt to understand flake mechanics. Specifically, I was focusing on the fluting process of the Folsom point. To be mathematically correct, I wanted to use measurements of preforms and channel flakes that matched the archaeological record. Fortunately, I had sufficient examples in the Baker Collection to obtain these measurements. However, I soon discovered that there was one measurement I couldn't obtain: the thickness of the channel flakes from the two opposing faces of the same point.
I obviously could measure the thickness of the channel flakes, but I didn't know which ones came from Face A (first removal) and which came from Face B (second removal). In other words, how did the dimensions of channel flakes from Face A differ from those from Face B?

One day my luck changed. While looking at Folsom preforms with Bob Patten, we discovered that the preparation work for fluting Face A differed from that of Face B. Figure 1 is a fragment of a preform abandoned during manufacture. As I wrote above, the dressing flake scars on Face A were different from those on Face B. Basically, the dressing flake scars on Face A are wider (larger) than those on Face B. This can be easily seen in Figure 2, which depicts three more Folsom preform fragments broken during manufacture. Unlike the one in Figure 1, which was broken before Face B was fluted, these were broken during the fluting of Face B.
Channel flakes with wide dressing scars are more likely to have come from Face A, while those with narrower scars are more likely to have come from Face B. I underlined this statement because it is the basis for the following analysis.
I chose 80 channel flakes from the Baker Collection that were intact enough to show the dressing scar pattern. I then visually force-ranked them by dressing scar width. After the forced ranking, Channel Flake #1 had dressing scars that were narrower than those of the other 79 channel flakes. Likewise, Channel Flake #37 had scars that were wider than those numbered 1 to 36 and narrower than those numbered 38 to 80. After the forced ranking, I measured the width and thickness of the 80 channel flakes.

I reasoned that the channel flakes numbered 1 to 40 should relate to Face B, and those numbered 41 to 80 should relate to Face A. I fully expected that there would be a difference in the average values of width and thickness between the two groups. To my surprise, there was not. Based on t-tests, channel flakes from Face A and Face B have an 84% chance of being equivalent based on width and a 48% chance based on thickness. Table 1 presents the numeric results along with data from the literature for three other Folsom sites (Tunnell and Johnson 1991). To further emphasize the statistical sameness of the two faces, I also prepared Figures 4 and 5. These present the actual values for width and thickness of each of the 80 channel flakes in their forced ranking order. Figures 4 and 5 make a convincing argument that the first 40 channel flakes (which I consider to be from Face B) are identical to the second 40 channel flakes (Face A).
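For readers who want to try this kind of two-group comparison themselves, here is a minimal Python sketch. It is only an illustration under stated assumptions: the widths are randomly generated stand-ins, not the Baker Collection measurements, and the p-value comes from a permutation test rather than the t-test used above, so that nothing beyond the standard library is needed. The function names `welch_t` and `permutation_p_value` are my own.

```python
import random
import statistics

def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    var_a = statistics.variance(a) / len(a)
    var_b = statistics.variance(b) / len(b)
    return (statistics.mean(a) - statistics.mean(b)) / (var_a + var_b) ** 0.5

def permutation_p_value(a, b, n_shuffles=2000, seed=1):
    """Two-sided p-value: how often does a random relabeling of the flakes
    give a mean difference at least as large as the observed one?"""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(left) - statistics.mean(right)) >= observed:
            hits += 1
    return hits / n_shuffles

# Hypothetical widths (mm) for 80 flakes, assumed to be in forced-ranking order.
rng = random.Random(42)
widths = [rng.gauss(15.0, 2.0) for _ in range(80)]
face_b, face_a = widths[:40], widths[40:]   # ranks 1-40 and ranks 41-80

t_stat = welch_t(face_a, face_b)
p_value = permutation_p_value(face_a, face_b)
print(f"t = {t_stat:.2f}, permutation p = {p_value:.2f}")
```

A large p-value here means that shuffling the Face A and Face B labels at random produces differences as big as the observed one quite often, which is the same kind of "no visible difference" result described above.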
The research presented up to this point was completed by the fall of 2000, and I presented it at the Plains Anthropological Conference that year in St. Paul, MN.

## Current Research (more than nine years later)

In January 2010 I was discussing Folsom point manufacture with John Garrett, a fellow engineer and amateur archaeologist. I showed him preforms and channel flakes, including those in Figures 1, 2, & 3. I pointed out that the dressing flake scars on Face A were wider than those on Face B. I explained to him that this was because the preform was narrower when dressing Face B. I said it in a manner that sounded like it was fact. Yet I didn't know this to be correct: it was only a theory, and I had never tested it.

These discussions with John compelled me to test the theory. Additionally, I wanted to quantify the difference in dressing scar widths from the two faces. To accomplish these objectives, I needed preform fragments that were broken as a result of Face B fluting attempts. Only those would have remnants of the dressing scars on both faces. Additionally, the fragments needed to be large enough to measure the width of at least two adjacent dressing scars on each edge of each face. I would have preferred more than two, but that would have reduced the number of preforms in the test population. With the requirement of only two adjacent scars, I was able to find 16 preforms in the Baker Collection. Figure 6 depicts these preforms and, as the reader can see, the size and material type vary considerably within the sample. Additionally, 12 of these preforms were found within a 22-mile radius of each other and 15 within a 50-mile radius. One (1) was found more than 300 miles from the rest.
Figure 9 depicts Face B flake scar width, less the effects of preform width, versus Face A scar width.^{5}
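The adjustment behind Figure 9 is simple arithmetic, and the short Python sketch below works through it using the coefficients given in note 5. The example preform (scar widths and preform width) is hypothetical and only there to show the calculation; the function names are my own.

```python
# Coefficients of the multiple regression equation given in note 5.
SLOPE_FACE_A = 0.4463         # per mm of Face A scar width
SLOPE_PREFORM_WIDTH = 0.0435  # per mm of preform width
INTERCEPT = 0.35              # mm

def predicted_face_b(scar_a_mm, preform_width_mm):
    """Face B scar width predicted by the regression equation."""
    return (SLOPE_FACE_A * scar_a_mm
            + SLOPE_PREFORM_WIDTH * preform_width_mm
            + INTERCEPT)

def figure9_value(scar_b_mm, preform_width_mm):
    """Measured Face B scar width with the preform-width effect subtracted."""
    return scar_b_mm - SLOPE_PREFORM_WIDTH * preform_width_mm

# Hypothetical preform: 5.0 mm Face A scars, 2.8 mm Face B scars, 30 mm wide.
print(f"predicted Face B width: {predicted_face_b(5.0, 30.0):.2f} mm")
print(f"value plotted in Figure 9: {figure9_value(2.8, 30.0):.2f} mm")
```

Subtracting the preform-width term leaves only the part of Face B scar width that preform width cannot account for, which is what gets compared against Face A scar width.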
Removing the effects of preform width (Figure 9) did not alter the fact that Face B scars are narrower than Face A scars on all 16 preforms. I wondered why this was the case. Nor did it alter the trend that Face B scars increase in width as Face A scars do. So, a second question is why Face B scars widen as Face A scars do.

To test whether different lithic material types might have some influence on flake scar width, I returned to my data set of 80 channel flakes in Figures 4 and 5. Remember that these had been ordered (ranked) by dressing scar width. Of the 80, 71 were represented by 10 material types with more than one channel flake in each type. The other nine each represented a different, additional material, and I placed them together in a group I called singletons. Table 3 presents the results of the single-factor analysis of variance (ANOVA in Excel) of the 80. As can be seen, material type does not seem to be a factor in dressing scar width. Instead, it is far from being important, with a p-value of 0.36. I made a second run without the singleton group and the p-value changed to 0.32.
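The calculation behind a single-factor ANOVA can be sketched in a few lines of Python. This is not the Excel run reported in Table 3: the material names and rank values below are invented for illustration, and the sketch stops at the F statistic, since turning F into a p-value needs the F distribution, which a spreadsheet or statistics package supplies.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical forced-ranking positions, grouped by (made-up) material type.
ranks_by_material = {
    "material_1": [3, 18, 41, 77],
    "material_2": [9, 25, 52, 60],
    "material_3": [12, 30, 44, 71],
}
f_stat, df_b, df_w = one_way_anova_f(list(ranks_by_material.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

An F value near or below 1 says the material groups differ from each other no more than flakes of the same material differ among themselves, which is the pattern a large p-value reflects.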
The results of the ANOVA work in Table 3 are worthy of a few more comments. Ultimately, they say that rock type does not influence the Folsom knapper's product. This realization then raises another question. Does this mean the physical properties, such as elasticity, hardness, and brittleness, do not vary from rock type to rock type? Most proficient modern knappers would strongly argue that these properties do vary. They could offer the example of heat treating and how it greatly alters these properties in some materials, and I agree that it does. Hypothetically then, would a Folsom knapper working heat-treated stone still make a channel flake with the same width, thickness, and dressing scar widths as he does working untreated stone? I suggest that he would. I further suggest that the Folsom knappers, as do proficient modern knappers, obtained feedback from working the stone, and that this feedback permitted them to adjust their application of force to produce the same product they produced the previous day with a different stone type. And they probably were not conscious of the tiny adjustments they were making. They may say "this rock is different or harder", but their product remains the same.

I had eliminated rock type as a possible answer to my two questions, so the two still loomed large. Why were the dressing scars on Face B narrower than those on Face A? And why was there a linear trend in the data for the 16 preforms in Figures 8 & 9? The answer had to be associated with the knapper. So I asked a knapper. Specifically, I asked Bob Patten, whom I mentioned earlier. Bob can replicate a Folsom point, and he had a good answer to the first question. He suggested the knapper can be bolder when dressing Face A because the preform is thicker and stiffer. Additionally, the knapper has little invested in the preform at this stage of the process, so in a sense he throws caution to the wind.
After Face A has been successfully fluted, the preform is thinner and more fragile. So, the knapper is more cautious with his work and that caution results in narrower, less risky flake removals. Although I am not a knapper, I can relate to this and I believe it is correct. So, my first question was answered.
To test for multiple knappers, I performed some stochastic computer modeling. First, I made the following assumptions:
I then mathematically created 200 hypothetical preforms by randomly sampling the two normal distributions that represent the dressing flake scar widths of Face A and Face B. Figure 10 depicts how these 200 preforms distribute on a graph.
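A minimal Python sketch of this kind of simulation follows. It assumes the Face A and Face B widths are drawn independently of each other, uses the two averages quoted in the text (5.3 mm and 2.7 mm), and uses made-up standard deviations, since the spreads are not restated here. It also includes the repeated pulls of 16 preforms that the discussion turns to next. It is an illustration of the idea, not a reproduction of Figure 10.

```python
import random
import statistics

def r_squared(xs, ys):
    """Share of the variation in ys explained by a straight-line fit on xs."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

MEAN_A, MEAN_B = 5.3, 2.7   # averages quoted in the text (mm)
SD_A, SD_B = 1.0, 0.6       # hypothetical spreads, for illustration only

rng = random.Random(2010)
# One knapper aiming at the same product every time: each face's scar
# width is drawn on its own, with no link between Face A and Face B.
preforms = [(rng.gauss(MEAN_A, SD_A), rng.gauss(MEAN_B, SD_B)) for _ in range(200)]
face_a, face_b = zip(*preforms)
print(f"r^2 for all 200 preforms: {r_squared(face_a, face_b):.3f}")

# Pull 16 preforms at a time, 1000 times, and record r^2 for each pull.
r2_samples = []
for _ in range(1000):
    sample_a, sample_b = zip(*rng.sample(preforms, 16))
    r2_samples.append(r_squared(sample_a, sample_b))
print(f"largest r^2 among the 1000 pulls: {max(r2_samples):.3f}")
```

Counting how many of the 1000 pulls reach a given r^2 is one way to put odds on a trend appearing by sampling luck alone.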
The 200 hypothetical preforms in Figure 10 cluster in a cloud around the intersection of the two averages of 5.3 mm and 2.7 mm. In statistics, if one can't visually see a trend in the dots, there isn't one. And there isn't one in Figure 10. The mathematics bears this out, as the linear regression line explains less than 1% of the variation. Therefore, with the assumption that the knapper tries to make the same product each time, it is not possible for a single knapper to have created the trend seen in Figures 8 and 9.

It occurred to me that my 16 preforms represent only a small sample of the archaeological record. Maybe the trend was a result of the sampling and did not exist in the larger population. So, I sampled the 200 hypothetical preforms 1000 times, randomly pulling 16 preforms at a time. As a result, I can state the odds of creating a linear trend with an r
Returning to Figure 9, I removed the regression line and added a line from the center of the cluster (5.3 mm, 2.7 mm) to the most distant preform. This is shown in Figure 11. I reasoned that if there was a second knapper responsible for any of the preforms, then the one furthest from the center would most likely be one of his. Using stochastic modeling again, I assumed the null hypothesis was that there was only one knapper. Then I returned to the 200 hypothetical preforms in Figure 10 and increased the number to 1000. I calculated the distance of all 1000 from the center of the cluster (5.3 mm, 2.7 mm) and determined how many were equal to or greater than 3.7 mm, the distance to the furthest preform in Figure 11. Only three of the 1000 met this criterion, which is a p-value of 0.003. This p-value is quite small, so I rejected the null hypothesis and accepted the hypothesis that this most distant preform was made by a second knapper. At this point I had convinced myself that there were at least two knappers visible in the data: Knapper A, who made 15 of the preforms, and Knapper B, who made the most distant preform in Figure 11.
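The distance test described above can be sketched as follows. The center (5.3 mm, 2.7 mm) and the 3.7 mm distance come from the text; the standard deviations are hypothetical, and I am assuming plain straight-line (Euclidean) distance on the Face A versus Face B plot, so the numbers printed will not match the 0.003 reported above.

```python
import math
import random

def tail_probability(center, sds, observed_distance, n_sim=1000, seed=7):
    """Fraction of simulated single-knapper preforms that fall at least
    `observed_distance` from the center of the cluster."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        a = rng.gauss(center[0], sds[0])   # simulated Face A scar width
        b = rng.gauss(center[1], sds[1])   # simulated Face B scar width
        if math.hypot(a - center[0], b - center[1]) >= observed_distance:
            hits += 1
    return hits / n_sim

CENTER = (5.3, 2.7)   # cluster center from the text (mm)
SDS = (1.0, 0.6)      # hypothetical spreads, for illustration only

p = tail_probability(CENTER, SDS, 3.7)
print(f"simulated preforms at or beyond 3.7 mm: {p:.3f}")
```

The same function can be called again with a new center and a new distance after the most remote preform is set aside, which is the step-by-step removal taken up next.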
Is there still another knapper visible in Knapper A's 15 preforms? To answer this question, with the null hypothesis that Knapper A made the 15 remaining preforms, I calculated a new center for the 15 preforms and the distance to the most remote. I re-created the 1000 preforms with the new center. This time the odds of this most distant preform occurring increased to eight out of 1000, or p=0.008. This again is a small p-value, so I chose to reject the null hypothesis that Knapper A made the 15 preforms and accept the alternate hypothesis that there was a Knapper C who made the most distant of the fifteen. I continued this process and found two more preforms that had very small p-values. However, the p-values rose rapidly after removing these first four. So, I am convinced there are five knappers (A, B, C, D & E) visible in the 16 original preforms. Knapper A made 12 of them and Knappers B, C, D & E each made one. Figures 12 & 13 illustrate which preforms belong to which knappers.
After looking at Figure 14, the reader is probably thinking that the find locations of the various preforms influenced the assignment of the preforms to the various knappers. I can tell you that they did not. In fact, the map in Figure 14 was an afterthought created out of curiosity. I was as surprised as I suspect the reader is, as Figure 14 is remarkable support for the discussion.

What are the odds of finding evidence of only one knapper within a 22-mile radius? To put this question into better perspective, consider the fact that this area also produced 77 finished Folsom point fragments, 147 additional preform fragments, and 415 channel flake fragments. And this represents only the material that was found by my father and me. How much more material is still on the ground or in other collections? That said, what is the possibility that the 12 preforms of Knapper A, which were found in an area with a 22-mile radius, are the result of collection bias? Again, what are the chances my father and I were simply lucky enough to find the products of only one knapper in this area? To answer this question, let's assume that there were only two knappers in the subject area and that they made an equal number of points over time. This hypothetical case is identical to flipping a coin, where only heads or tails are possible outcomes. So what is the probability of flipping heads twelve (12) times in a row? This question has an easy answer, which is 0.5^{12}, or about 0.00024 (1 chance in 4,096).

Suppose the Knapper A preforms were actually made by a number of different individuals, but the variation in the dressing scars between the different individuals is so large that they overlapped each other. Then the statistics can't separate them. This would be a classic example of the variation within the group (the individual) being greater than between the groups (individuals). The only way I can see how this might be tested is with experimental archaeology and modern knappers.
Unfortunately, I don't know of that many good Folsom point knappers. So, at this juncture, I will continue to believe that Knapper A is a single individual who is responsible for the 12 preforms in Figures 12, 13, & 14.

The other area of preform concentration in Figure 14 is the West Area, which is the location of Knappers B, C, & D. Although it is not apparent in the figure, this West Area is similar in size to Knapper A's Area, or about 20+ miles in radius. However, the density of artifacts is less than 1/10 of that of Knapper A's Area, based on finding only three finished Folsom point fragments, nine additional preform fragments, and 35 channel flake fragments. So, at first glance, the West Area and Knapper A's Area seem to be telling conflicting stories: the area with the most diagnostic Folsom stuff has only a single knapper, while the area with more knappers has the least Folsom stuff. The answer is that the amount of Folsom stuff is not related to the number of knappers in the area, but to the time the knappers spend in the area. In other words, the volume of stuff is related to knapper-years in the area. (See

## Concluding Remarks

If the reader hasn't figured it out by now, then you probably are wondering why I titled the paper

The most important thing about the work presented here is the strong support for a single expert knapper who is making all the points for the various Folsom hunters in a given area. As I stated in the beginning of this paper, I used to believe that this was the case because the making of the point is so difficult for modern knappers. However, in the 1990s I began to change my opinion to a belief that each hunter made his own points. I began to believe that all the everyday Folsom hunters were just exceptional flint knappers. The research presented here has caused me to return to my earlier belief. Such is science.

## References

Frison, George C. and Bruce Bradley. 1980. Folsom Tools and Technology at the Hanson Site, Wyoming. University of New Mexico Press, Albuquerque.

Tunnell, C. and L. Johnson. 1991. Comparing Dimensions for Folsom Points and Their Byproducts from the Adair-Steadman and Lindenmeier Sites, As Well As a Few Other Localities. Texas Historical Commission, Austin.

Whittaker, John C. 1994. Flintknapping: Making & Understanding Stone Tools. University of Texas Press, Austin.

## Notes

1. Abstract of paper presented at 2000 Midwest Archaeological / Plains Anthropology Conference, St. Paul, MN.
Folsom Point Manufacture - A Common Task Preformed by All

2. Figure 1 depicts a preform that is a very rare artifact. It is rare because, unlike most Folsom preform failures, which occur during the high-risk steps of fluting Faces A or B, this failure occurred during the low-risk steps of preparing Face B for fluting. The fact that only seven (7) of the 264 preforms in the Baker Collection exhibit this type of failure is indicative of how low-risk preparing Face B really is.

3. Distinguishing Face A from Face B on a Folsom preform can easily be demonstrated in Figure 2. The most obvious indication is the deep scoop on Face B immediately above the striking platform. This is the negative scar of the bulb of percussion that is on the channel flake. On Face A, the striking platform and this negative scar (scoop) have been removed in preparing for fluting Face B. They have been removed by shortening the preform and "turning the edge."^{4} If the negative scar were left on the preform, Face B could not be properly fluted because the Face B channel flake would not run past this thin section of the preform.

4. "Turning the edge" is a term I picked up from the modern knapping community. I understood the process before I learned what the knappers called it. In 1998, I called it "reversing the bevel." See Stage 8--Channel Platform Preparation (Face B).

5. The multiple regression equation is:

ScarWidth_{FaceB} = (0.4463)*ScarWidth_{FaceA} + (0.0435)*PreformWidth + 0.35

The Face B values plotted in Figure 9 come from:

Figure9values_{FaceB} = ScarWidth_{FaceB} - (0.0435)*PreformWidth

6. Shortly after developing my stochastic method, I attended the 2010 Paleoanthropology/SAA meetings in St. Louis. At the meetings I heard a statistical paper presented by Erik Otárola-Castillo (Iowa State University), and I felt this was the person who could critique my methodology. So, I approached Erik and, to my surprise, he was a tremendous help. Plus, he was more gracious than I would have ever expected.

One of the issues Erik had with my methodology was that I was creating my stochastic preforms by randomly sampling from a normal distribution. Fundamentally, however, flake scar widths cannot be normally distributed. Flake scar widths can never be equal to or less than zero mm, yet the normal distribution has the possibility of creating negative values all the way to negative infinity. Erik suggested I convert my scar width values to logarithmic values and then re-apply my stochastic method. When I did this, it raised the p-values of the preforms of Knappers C, D, & E. For example, Knapper E's p-value increased from 0.012 to 0.106. This increase in p-values did not alter their occurrence in the process, nor the fact that they were different from the 12 preforms of Knapper A.

The other critique that Erik offered focused on my strong adherence to rejecting the null hypothesis if the p-values were 0.05 or less. He reminded me that terms such as "statistically different" or "statistically significant" are not hard boundaries. We in the social sciences have arbitrarily and almost tacitly established this threshold so we can make decisions and answer questions with a forthright yes or no. But there really is a gray area here that we are ignoring. Erik further pointed out that statistical power is lost with small sample sizes, which degrades the power of the 0.05 p-value threshold. My 16 preforms are a small sample. As a result of this critique, I rewrote the paper and removed words like "statistically significant". This philosophy also permitted me to present my stochastic method in good conscience.
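Erik's point about the normal distribution can be seen in a small Python sketch. The scar widths below are invented for illustration, and the approach shown (take logarithms, simulate on the log scale, convert back) is one straightforward reading of his suggestion rather than a record of exactly what was done for this paper.

```python
import math
import random
import statistics

def simulate_in_log_space(observed_widths, n, rng):
    """Draw n simulated widths by sampling a normal distribution fitted to
    the logarithms of the observed widths, then converting back to mm."""
    logs = [math.log(w) for w in observed_widths]
    mu, sigma = statistics.mean(logs), statistics.stdev(logs)
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]

# Hypothetical Face B scar widths (mm), for illustration only.
observed = [1.2, 1.9, 2.4, 2.7, 3.1, 3.8, 4.6]

rng = random.Random(3)
log_draws = simulate_in_log_space(observed, 10000, rng)
normal_draws = [rng.gauss(statistics.mean(observed), statistics.stdev(observed))
                for _ in range(10000)]

# Widths simulated on the log scale can never reach zero or go negative;
# widths drawn straight from a normal distribution sometimes do.
print("smallest log-scale draw (mm):", round(min(log_draws), 3))
print("normal draws at or below zero:", sum(d <= 0 for d in normal_draws))
```

Because the exponential of any number is positive, every width simulated on the log scale is a physically possible scar width.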
