Introduction
As I hope was made clear in my previous series, “Refining the Peak Oil Rosy Scenario,” the logistic equation developed by Hubbert for the analysis of oil production data (henceforth “Hubbert’s Equation”) has a number of inherent limitations, especially when it comes to modeling the decline side of the production curve. In brief, Hubbert’s Equation assumes that production can be accurately described by a single rate constant for production, “a,” and a single total recoverable amount of oil, Q∞. It therefore assumes that a peak in production will occur and that the decline side of the production curve will be the mirror image of the growth side. The model cannot account for changes (i.e., increases or decreases) in the production rate constant “a” or in the total recoverable amount of oil, Q∞, that may occur on the decline side of the curve.
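The symmetry assumption can be checked numerically. The following is a minimal sketch, assuming the standard logistic form for cumulative production Q(t) that passes through a known point (to, Qo); the function names and the parameter values are made up for illustration and are not fitted values from this analysis.

```python
import math

def hubbert_cumulative(t, a, q0, q_inf, t0):
    """Cumulative production Q(t) for a logistic curve passing through (t0, q0)."""
    return q_inf / (1.0 + ((q_inf - q0) / q0) * math.exp(-a * (t - t0)))

def hubbert_rate(t, a, q0, q_inf, t0):
    """Production rate of the logistic curve: dQ/dt = a * Q * (1 - Q / q_inf)."""
    q = hubbert_cumulative(t, a, q0, q_inf, t0)
    return a * q * (1.0 - q / q_inf)

def peak_time(a, q0, q_inf, t0):
    """Time at which Q = q_inf / 2, where the production rate is at its maximum."""
    return t0 + math.log((q_inf - q0) / q0) / a

# Illustrative (made-up) parameters, NOT the fitted values from the article.
a, q0, q_inf, t0 = 0.07, 30.0, 200.0, 1949.0
tp = peak_time(a, q0, q_inf, t0)

# With constant "a" and q_inf, the rate is the same at equal distances
# before and after the peak: the decline mirrors the growth.
for dt in (5.0, 10.0, 20.0):
    before = hubbert_rate(tp - dt, a, q0, q_inf, t0)
    after = hubbert_rate(tp + dt, a, q0, q_inf, t0)
    print(f"{dt:>4.0f} yr from peak: {before:.4f} before, {after:.4f} after")
```

Any departure of measured production from this mirror-image behavior is what the fractional yearly changes discussed in this article are meant to capture.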
As I showed in “Refining the Peak Oil Rosy Scenario,” when the assumptions of a constant “a” and Q∞ do not hold, and for instance, “a” or Q∞ have yearly fractional changes, the best fits using Hubbert’s Equation give systematically poor estimates of “a” or Q∞ as compared to the true values (in this case, known values because the data in my analysis was simulated data). Consequently, the predicted future production rates can be radically different from the true value. Using the linearized version of Hubbert’s Equation does not cure this problem.
To address this, I modified Hubbert’s Equation to allow for fractional yearly changes in “a” (fca) or in Q∞ (fcq), as compared to the best-fit estimates for “a” and Q∞ obtained from the growth side and plateau region of the production curve. The testing of this approach using simulated data is described in Part 7 of the “Refining the Peak Oil Rosy Scenario” series.
Some hypotheses to consider
How might society respond to the experience of hitting peak oil production?
On one hand, there may be an effort to conserve this important depleting resource by decreasing production. In this case I would expect to see, at some point at or past the plateau in production rate, a fractional yearly decrease in “a” (i.e., fca < 1) as compared to “a” on the growth side of the production curve. On the other hand, there may be an attempt to step up production in the hopes of continuing economic growth, maybe at least until alternative energy sources are found or developed, or because the price of oil has increased. In this case, I would expect to see a fractional yearly increase in “a” (i.e., fca > 1).
Either an increasing or decreasing “a” could occur in the face of changing total recoverable oil, Q∞.
In some cases, because of one or more of increasing costs of oil recovery (e.g., due to a decreasing EROEI), decaying infrastructure, or increasing regulations or outright prohibitions on pumping (e.g., consider the moratorium in the Gulf or off the coast of California), the effective total amount of oil that is recoverable could decrease as compared to the estimate of Q∞ obtained mainly from the growth side of the production curve (i.e., fcq < 1). Alternatively, there could be a technological improvement in the recovery of oil (i.e., a greater percentage of the oil in the ground can be extracted), or new oil sources could be put into production (i.e., fcq > 1).
Of course, none of the above may occur, in which case there would be no signs of a change in “a” or in Q∞ (i.e., fca = 1 and fcq = 1). In this case, the decline side of the production curve should be a mirror image of the growth side of the curve.
I suppose that this might be the kind of scenario one would expect if one were monitoring micro-organisms or insects converting a finite, fixed amount of an energy source into waste products. Thought of this way, my main hypothesis is that when it comes to the use of their “energy source” of petroleum, humans will not behave the same as bugs, and the growth and decline sides of the production curve will not be symmetric.
The USA data set and procedures for data analysis
The sources of data describing total USA petroleum production were already described and briefly considered in Part 5 of the “Refining the Peak Oil Rosy Scenario” series.
I relied on the EIA's Table 5.1 Petroleum Overview, which provides yearly production and consumption data back to 1949 for the USA. As I will discuss in some detail later on, Table 5.1 also gives a breakdown of the production data into three general sources.
Non-linear least squares (NLLS) analysis of total production
As in the preliminary analysis done in Part 5, I started my NLLS analysis with the whole span of the production data from 1949 to 2009 and used Hubbert’s equation (e.g., equation [3] from Part 5) to obtain the best fit to this data.
Then I repeated the NLLS analysis, again using Hubbert’s equation, to obtain best fits to progressively smaller data sets: 1949 to 1999, 1949 to 1989, 1949 to 1979 and 1949 to 1969.
Figure 1 shows the best-fit result from the NLLS analysis using Hubbert’s equation for each of these time ranges. All of the parameters (“a,” Qo and Q∞) were allowed to vary to minimize the residual sum of squares (Srss).
Discounting the 1949-69 period, where the NLLS analysis blows up (which, as discussed previously, is likely because there was not enough plateau in the production curve to estimate Q∞), the longer the time period considered, the larger the Q∞ and the smaller the “a.”
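To make the quantity being minimized concrete, here is a minimal sketch of the Srss objective evaluated over progressively shorter time windows. It assumes the standard logistic rate form for Hubbert’s equation and uses synthetic, noise-free data generated from made-up parameters; it does not include the optimizer itself, and none of the names or numbers come from the actual fits.

```python
import math

def hubbert_rate(t, a, q0, q_inf, t0):
    """Logistic production rate dQ/dt for a curve passing through (t0, q0)."""
    ratio = (q_inf - q0) / q0
    q = q_inf / (1.0 + ratio * math.exp(-a * (t - t0)))
    return a * q * (1.0 - q / q_inf)

def srss(params, years, observed, t0):
    """Residual sum of squares between observed and modeled yearly production."""
    a, q0, q_inf = params
    return sum((obs - hubbert_rate(t, a, q0, q_inf, t0)) ** 2
               for t, obs in zip(years, observed))

# Synthetic "measurements" generated from made-up parameters (a, Qo, Q_inf).
true_params = (0.07, 30.0, 200.0)
t0 = 1949
years = list(range(1949, 2010))
observed = [hubbert_rate(t, *true_params, t0) for t in years]

# Evaluate the objective for a deliberately wrong trial parameter set
# on the full span and on progressively shorter spans.
trial_params = (0.08, 30.0, 180.0)
for end_year in (2009, 1999, 1989, 1979, 1969):
    count = end_year - 1949 + 1
    value = srss(trial_params, years[:count], observed[:count], t0)
    print(f"1949-{end_year}: Srss for trial parameters = {value:.4f}")
```

An NLLS routine would adjust “a,” Qo and Q∞ until this value is as small as possible for the chosen window.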
This suggested to me that the production curve is not following a simple symmetric relationship where the growth and decline sides of the curve are identical.
For instance, look at the fit to the 1949-79 range of data in Figure 1: the fit on the decline side of the curve departs widely from the measured production data. Every measured value from 1980-2008 is to the right of the curve predicted from the best fit parameters. The same trend is present, though less prominent, for the best fits to the 1949-89 and 1949-99 ranges: for instance, all of the last fifteen years of production data are to the right of the best fit curves.
Modified analysis of the production data from 1980-2009
I used the best-fit values of “a”, Qo and Q∞ obtained from fitting the 1949-79 data using Hubbert’s equation as “fixed” parameters for a subsequent fit to the 1980-2009 production data, using equation [9], derived in Part 7 of the “Refining the Peak Oil Rosy Scenario” series:
dQ/dt = (Q∞p∙fcq^(t – td)) / (1 + (((Q∞p∙fcq^(t – td)) – Qo)/Qo)∙exp(–(ap∙fca^(t – td))∙(t – to))) / Δt   [9]
I put quotes around “fixed” because, of course, I have to use these best-fit parameters of “a”, Qo and Q∞ to calculate a new Qo for 1979 and then use this value as the starting input for equation [9] to analyze the 1980-2009 period. But “a” and Q∞ were fixed to the values shown in the 1949-1979 column of Table 1 throughout the analysis. The parameters fca or fcq, or both, were allowed to vary for the fit to the 1980-2009 period.
Figure 2 shows the best fit to the 1980-2009 data (solid green line) when, along with “a” and Q∞, fca was also fixed equal to 1 (i.e., fca is not a parameter in this case) and only fcq was allowed to vary. That is, the solid green line in Figure 2 shows the best fit obtained using equation [9] to the 1980-2009 data with “a” fixed to 0.0656 yr^-1, Q∞ fixed to 233 bbs, fca equal to 1 and fcq equal to 1.0053.
For comparison, in Figure 2, I also included the full curves, previously shown in Figure 1, for the best fit to the full 1949-2009 data (dashed red line) and the 1949-1979 data (solid blue up to 1979, then dashed from 1980-2009) using the traditional Hubbert equation.
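The structure of equation [9] can be sketched in a few lines. This is only an illustration under stated assumptions: the yearly production is taken as the change in the bracketed cumulative expression over Δt = 1 year (how Δt enters is my reading, not something the equation as printed settles), the reference years to and td are both set to 1979, and the starting Qo of 130 bbs is a hypothetical placeholder. Only “a” = 0.0656 and Q∞ = 233 are taken from the 1949-1979 fit.

```python
import math

def modified_cumulative(t, a_p, q0, q_inf_p, fca, fcq, t0, td):
    """Logistic cumulative production with yearly compounding changes in a and Q_inf."""
    a_t = a_p * fca ** (t - td)          # rate constant after (t - td) years
    q_inf_t = q_inf_p * fcq ** (t - td)  # recoverable total after (t - td) years
    return q_inf_t / (1.0 + ((q_inf_t - q0) / q0) * math.exp(-a_t * (t - t0)))

def modified_yearly_production(t, a_p, q0, q_inf_p, fca, fcq, t0, td, dt=1.0):
    """ASSUMPTION: production is the change in cumulative output over dt years."""
    q_now = modified_cumulative(t, a_p, q0, q_inf_p, fca, fcq, t0, td)
    q_prev = modified_cumulative(t - dt, a_p, q0, q_inf_p, fca, fcq, t0, td)
    return (q_now - q_prev) / dt

A_P = 0.0656      # from the 1949-1979 fit quoted in the article
Q_INF_P = 233.0   # from the 1949-1979 fit quoted in the article
Q0_1979 = 130.0   # HYPOTHETICAL cumulative production at the start year
T0 = TD = 1979.0  # ASSUMED reference years

# fca = fcq = 1 recovers the symmetric logistic; fcq > 1 lets Q_inf grow each year.
symmetric_2009 = modified_yearly_production(2009.0, A_P, Q0_1979, Q_INF_P, 1.0, 1.0, T0, TD)
modified_2009 = modified_yearly_production(2009.0, A_P, Q0_1979, Q_INF_P, 1.0, 1.0053, T0, TD)
print(f"2009 production, fcq = 1:      {symmetric_2009:.2f}")
print(f"2009 production, fcq = 1.0053: {modified_2009:.2f}")
```

Because the starting Qo here is a placeholder, the printed numbers only illustrate the direction of the effect: a growing Q∞ slows the decline relative to the symmetric curve.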
Discussion Points
Making long term predictions using the traditional logistic equation is dangerous
One of the most striking features of Figure 2 is how badly Hubbert’s equation, fit only to the 1949-1979 data, underestimated the present 2009 production rate. Of course, this is due to the assumption inherent in this equation that the decline in production will mirror the growth in production. For instance, if we extend the best fit out 30 years from 1979, the dashed blue line predicts that 2009 yearly production would be ~1.3 bbls/yr instead of the actual value of ~2.6 bbls/yr: the actual value is about 100% higher than the prediction! The lesson should be clear: using the traditional Hubbert’s equation to make long-term estimates of production is only as good as the inherent assumption of symmetry.
The fit using a variable fcq is a significantly better fit (p-value < 0.05)
Another noteworthy feature of Figure 2 is that the fit to the 1980-2009 data subset using equation [9] appears to be better than the fit of Hubbert’s equation to the full 1949-2009 data set over the same 1980-2009 time span (i.e., the dashed red line from 1980-2009). However, is the fit using equation [9] significantly better?
It is well known that when comparing the fits of two related or “nested” equations to a set of data, the equation with the larger number of parameters will always have an equal or better fit, as quantified by a lower Srss. It is also well known that a way to test whether the fit using the larger-parameter equation is statistically significantly better, beyond what is expected from simply adding the additional parameter, is the F-test.
For instance, for the case where n data points are used to estimate parameters for a larger-parameter model (i.e., number of parameters = p2) and a smaller-parameter model (i.e., number of parameters = p1), the F-statistic is given by:
F = ((Srss(smaller) – Srss(larger)) / (p2 – p1)) / (Srss(larger) / (n – p2))
The calculated F is compared against the critical value of F for a chosen probability value (p-value) and the degrees of freedom associated with the two equations (i.e., n-p1 and n-p2 degrees of freedom for the smaller and larger models, respectively). The critical value can be looked up in a statistical table or, using EXCEL, obtained from the FINV function (specifically, FINV(p-value, n-p1, n-p2)).
In the present case, for the fit to the 1980-2009 data, the larger model, equation [9], has four parameters (p2 = 4, corresponding to “a”, Qo, Q∞ and fcq; fca was not used) and the smaller model, the traditional logistic equation, has three parameters (p1 = 3, corresponding to “a”, Qo and Q∞). My view is that even though “a”, Qo and Q∞ are fixed to their best-fit values from the NLLS analysis of the 1949-1979 data, they still play a role in determining the best fit to the 1980-2009 data and therefore should be considered as parameters in the larger model. This tends to make the critical F value larger, and therefore more difficult to overcome for a finding of statistical significance, than if equation [9] were considered to have only one parameter, fcq, that is being compared to a three-parameter equation, Hubbert’s equation.
The value of Srss (larger) is equal to 0.3050 which is the best-fit value obtained for this data range using equation [9], and Srss(smaller) is equal to 0.6213 which is the best fit value obtained for this same data range using the Hubbert equation.
Accordingly, my calculated F value equals 27, which is greater than the critical F value (0.01, 27, 26) of 2.54. That is, there is at least a 99% probability that the better fit using equation [9], as compared to the traditional logistic Hubbert equation, is due to something other than random scatter in the data. In other words, there is a less than 1% chance that random scatter in the data would explain the better fit of equation [9] as compared to the Hubbert equation.
Similar fits and significance tests were done on the 1980-2009 data using equation [9] where fcq was not used (i.e., fcq fixed equal to 1) and fca was allowed to vary, and where both fcq and fca were allowed to vary. The best fit when fca was varied was not significantly better than the best fit using the traditional logistic equation (p-value > 0.1). Similarly, the best fit when both fca and fcq were allowed to vary was not significantly better than the best fit when only fcq was varied (p-value > 0.1).
But does the best fit from this modified approach give a significantly better fit to the measured values for the data set as a whole (i.e., the full range from 1949-2009) than the Hubbert equation? Yes, it does.
To answer this question, I needed to take the sum of the Srss for the best fit of the Hubbert equation to the 1949-1979 data (Srss = 1.0293; Table 1) plus the Srss for the best fit of equation [9] to the 1980-2009 data (Srss = 0.3050). The sum of the two Srss values equals 1.3343. In comparison, the Srss for the best fit of the Hubbert equation to the 1949-2009 data equals 1.946 (Table 1). The Hubbert equation again has three parameters, the modified equation has four parameters, and the total number of data points equals 60.
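The arithmetic behind the calculated F value can be reproduced with a short sketch. The Srss values and parameter counts are the ones quoted above; n = 30 is inferred from the yearly 1980-2009 span and from the degrees of freedom quoted for the critical value, and the function name is just illustrative.

```python
def nested_f_statistic(srss_smaller, srss_larger, p1, p2, n):
    """F statistic comparing a smaller (p1-parameter) and larger (p2-parameter) nested model."""
    numerator = (srss_smaller - srss_larger) / (p2 - p1)
    denominator = srss_larger / (n - p2)
    return numerator / denominator

# Srss values and parameter counts quoted in the text; n inferred as 30 yearly points.
f_cal = nested_f_statistic(srss_smaller=0.6213, srss_larger=0.3050, p1=3, p2=4, n=30)
print(f"calculated F = {f_cal:.1f}")  # about 27
```

The critical value is not computed here because the Python standard library has no F distribution. One caveat, offered as general background rather than as the procedure used above: textbook treatments of the nested-model F-test usually take the degrees of freedom as (p2 – p1, n – p2), so the critical value depends on which convention is used.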
Therefore the calculated F value (Fcal) equals:
Fcal = ((1.946 – 1.3343) / (4 – 3)) / (1.3343 / (60 – 4)) ≈ 25.7
Again, this is significant at p < 0.01.
Implications from the modified NLLS analysis
The statistically significantly better fit obtained using equation [9] estimates an annual increase in Q∞ (starting from the Q∞ = 233 bbs estimate from the fit to the 1949-79 data using Hubbert’s equation) of about +0.53 percent per year through the 1980 to 2009 period.
Why might there be an increase of about half a percent per year in Q∞, the total estimated recoverable oil? Possible answers to this question were already presented at the beginning of this article: improvements in the recovery of oil or new oil sources put into production.
Is there any evidence to support either of these scenarios? Yes, there is.
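To get a feel for what +0.53 percent per year amounts to, here is a small worked calculation. It assumes the yearly factor compounds from 1979 (the end of the growth-side fit) through 2009; that starting year is my assumption for illustration.

```python
def compounded_q_inf(q_inf_start, fcq, years):
    """Total recoverable amount after compounding a fractional yearly change fcq."""
    return q_inf_start * fcq ** years

# Q_inf = 233 bbs and fcq = 1.0053 are from the text; 1979 as year zero is ASSUMED.
q_inf_2009 = compounded_q_inf(233.0, 1.0053, 2009 - 1979)
print(f"effective Q_inf in 2009: about {q_inf_2009:.0f} bbs")
print(f"cumulative increase: about {100.0 * (q_inf_2009 / 233.0 - 1.0):.0f} percent")
```

Under that assumption, a seemingly small yearly change adds up to an effective Q∞ roughly one-sixth larger after three decades.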
Improved recovery
Well, as far as improvements in recovery are concerned, according to Leonardo Maugeri, VP for corporate strategies and planning at the Italian energy company ENI, an important tertiary recovery technique, horizontal drilling, became commercially adopted in the 1980s:
“One of the most important developments so far has been the horizontal well, a dramatic breakthrough compared with the traditional vertical drilling used since the inception of the oil industry. Commercially adopted in the 1980s, this technique is particularly suitable for reservoirs where oil and natural gas occupy thin, horizontal strata, or in sections where vertical drilling can no longer be useful. With their flexible “L” shapes, horizontal wells can change direction and penetrate a reservoir horizontally, thus “assaulting” virgin sections of a reservoir.”
(From “Squeezing More Oil Out of the Ground.”)
So, the improved recovery of oil associated with the widespread adoption of horizontal drilling in the 1980s might at least partially explain why there is a trend for Q∞ to increase by half a percent per year: as more and more wells use this technique, the effective amount of recoverable oil increases.
New petroleum sources
What about new oil sources put into production from 1980-2009?
It turns out that the breakdown of the production data given in the EIA’s Table 5.1 provides some interesting answers to this question. Table 5.1 not only presents total petroleum production and consumption, it also gives a breakdown of production from the lower 48 states, Alaska and natural gas liquids.
A few trends are immediately obvious here, and they help explain why Q∞ would be increasing in the 1980-2009 time frame:
1) There was a brief upward spike in the overall downward trend in lower 48 state production, running from about 1981 to 1989. Plus, the decline side of the production curve is more gradual than the growth side.
2) There is a steady trend of increase in natural gas liquids throughout 1980 to 2009.
3) Alaskan production went from substantially nowhere in 1976 to a peak in production in 1988 and then went into decline thereafter.
Examining the components of total USA production
I thought that it would be interesting to examine the three components of production in some further detail, in the hopes of gaining some further insights into future production trends.
In particular I was interested in examining how the sum of these three individual components, when analyzed using my modified Hubbert equation [9], would compare to the analysis I did on the total production data.
Lower 48 state production
Figure 4 shows the NLLS best fits using the Hubbert equation to the full time span of 1949-2009 production data for the lower 48 states and also to spans made progressively shorter in 10-year steps:
Once again the NLLS analysis blew up for the 1949-69 data range. The best fits to the longer data sets show the same trend as for the total production data: for progressively longer time spans, the estimated “a” progressively decreased and the estimated Q∞ progressively increased. Once again this suggests to me that the production rate data for the lower 48 states does not follow the symmetric behavior inherent in the Hubbert equation. For instance, consider the best fit to the 1949-1979 data: almost every data point from 1980 on is to the right of the best-fit curve.
Figure 5 shows the best fit from the modified NLLS analysis of the lower 48 state production data, where once again I have used the best-fit values of “a”, Qo and Q∞ (0.0703, 32.05, 177.9, respectively) from the analysis of the 1949-1979 data using the Hubbert equation as my fixed parameters for a subsequent fit to the 1980-2009 data using equation [9].
Once again the fit using equation [9] with fcq variable gave a significantly better fit to the 1980-2009 data range than the best fit using the Hubbert equation fit to the full data range (p < 0.01). However, the fit using equation [9] with both fca and fcq variable gave a still better fit to this data range as compared to the best fit where only fcq was varied (p < 0.01). The solid green line shows the latter's best fit, obtained when fca = 0.976 and fcq = 1.0097. These best-fit results using equation [9] therefore suggest that, in addition to an annual increase of almost 1 percent in Q∞, the rate constant for production in the lower 48 states was actually declining annually by about 2.4 percent. The 1 percent increase in Q∞ more than offsets the 2.4 percent decline in “a” to give a more gradual decline in production than predicted using the traditional Hubbert equation.
Natural gas liquids
Figure 6 shows the NLLS best fits using the Hubbert equation to the full time span of 1949-2009 production data for natural gas liquids (NGL) and for shorter time spans, analogous to those shown in Figure 4. The same trends are present in the best fits to progressively longer time spans, again suggesting asymmetry in the production curve. In fact, this trend is much more prominent than it was with the total production or lower 48 state production. Again, look at the best fit to the 1949-1979 time span: after a local maximum in about 1972, the NGL production dips slightly and then takes off in 1980. Of course, the best fit to the 1949-79 data (blue line) using the Hubbert equation misses this completely.
Figure 7 shows the best fit from the modified NLLS analysis of the NGL production data, where I once more have used the best-fit values of “a”, Qo and Q∞ (0.0999, 1.59, 24.36, respectively) from the analysis of the 1949-1979 data using the Hubbert equation as my fixed parameters for a subsequent fit to the 1980-2009 data using equation [9].
The fit using equation [9] with fcq variable gave a significantly better fit to the 1980-2009 data range than the best fit using the Hubbert equation fit to the full data range (p < 0.01). Unlike the lower 48-state data, however, the fit using equation [9] with both fca and fcq variable did not give a significantly better fit as compared to the best-fit where only fcq was varied (p > 0.1).
The solid green line shows the best fit, obtained when fcq = 1.0183 using equation [9]. The result suggests an annual increase of about 1.8 percent in Q∞. The extension of this trend out to 2030 looks quite dramatic and suggests an increasingly important contribution of NGL to the total petroleum production, as both the lower 48 state (Figure 5) and Alaska production are predicted to decline (Figure 8, below). Will there be continued NG production to support this trend? That is a question for another day.
Alaskan production
The solid red line in Figure 8 shows the NLLS best fit using the Hubbert equation to the 1976-2009 time span of production data for Alaskan production, analogous to those shown in Figures 4 and 6. Including the set of zero or relatively near-zero values for 1949-1975 seriously biased the NLLS fit and gave poor fits to the 1976-2009 time span.
I cannot do the same kind of modified analysis on the 1980-2009 data that I did with the total, lower 48 or NGL production data, because there is not enough data before 1980 to make reasonable estimates of “a”, Qo and Q∞ for use in equation [9].
I tried to look at a smaller data range of 1990-2009 using equation [9], using the best fit of the Hubbert equation to the 1976-1989 time span for the fixed values of “a”, Qo and Q∞. However, varying neither fcq nor fca gave significantly better fits than the fit to the full data range using the Hubbert equation.
I think, however, that this finding is consistent with Hubbert equation fit to the measured values presented in Figure 8. The Hubbert Equation gives a fairly good fit to data on the decline side of the production curve, as say, compared to the lower 48-states or NGL measured production values (shown in Figures 4 and 6, respectively). There are some trends for the fit to systematically over-estimate production in the 1990s and then under-estimate production in the 2000s but these are small trends.
Note: the dashed line shows the back-extrapolated production values using the best-fit “a”, Qo and Q∞ values (0.135, 2.52, 20.13, respectively) obtained from the best fit to the 1976-2009 range. This amounts to fixing “a” and Q∞ to these best-fit values and then adjusting Qo to the value it would have to have in 1948 to give the value Q would have in 1975, as predicted by the best-fit parameters. I used these back-extrapolated values when combining the production data from the three components so there would not be a discontinuity in the plot (Figure 9 below).
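The back-extrapolation described in the note can be sketched as follows. With “a” and Q∞ held fixed, a logistic curve is fully determined by one known point, so running it backward in time gives the Qo an earlier start year would need. The sketch assumes the standard logistic form and, purely for illustration, treats the fitted Qo = 2.52 as the cumulative value in 1976; the years actually used may differ.

```python
import math

def logistic_q(t, a, q_ref, q_inf, t_ref):
    """Cumulative production at time t for a logistic curve through (t_ref, q_ref)."""
    return q_inf / (1.0 + ((q_inf - q_ref) / q_ref) * math.exp(-a * (t - t_ref)))

# Best-fit Alaskan values quoted in the text; the 1976 reference year is ASSUMED.
a, q_ref, q_inf, t_ref = 0.135, 2.52, 20.13, 1976.0

# Run the curve backward to find the Qo that an earlier start year would require.
t_start = 1948.0
q0_back = logistic_q(t_start, a, q_ref, q_inf, t_ref)
print(f"back-extrapolated Qo in {t_start:.0f}: {q0_back:.4f}")

# Sanity check: starting from the back-extrapolated Qo and running forward
# reproduces the reference value, so the two curve segments join smoothly.
q_forward = logistic_q(t_ref, a, q0_back, q_inf, t_start)
print(f"forward value in {t_ref:.0f}: {q_forward:.2f}")
```

The round trip is what guarantees there is no discontinuity where the back-extrapolated values meet the fitted curve.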
Putting it together: combining predicted lower 48-states, NGL and Alaskan production.
Figure 9 shows the sum of the individual best fits to the individual production data for the lower 48-states, NGL and Alaskan production. The NLLS best fits of the Hubbert equation to 1949-1979 data and then the modified best fit of equation [9] to 1980-2009 data and extrapolation into the future, are used as the best-fit models of the lower 48-states and NGL production data. The NLLS best fit using the Hubbert equation to 1976-2009 data and extrapolation into the future and past are used as the best model for the Alaskan production data.
For comparison, Figure 9 also shows two other best-fit models. The dashed green line shows the NLLS best fit of the Hubbert equation to the 1949-1979 total production data and then the modified best fit of equation [9] to the 1980-2009 total production data, with extrapolation into the future, as previously described in the context of Figure 2. The dashed red line shows the best fit to the total production data for 1949-2009 using the Hubbert equation, as described in the context of Figure 1.
The modified NLLS analyses of the total production (dashed green) and of the sum of the three component parts of the production (solid green) are in pretty good agreement with each other throughout the data range, as well as when extrapolated into the future until about 2020, where the predicted production rates start to diverge.
Does the modified fit to the combined components provide a significantly better fit than the modified fit to the total production data? No, actually the Srss from the combined components production (1.568) is slightly higher than the Srss from the modified analysis of the total production (1.3343).
However, I think the analysis of the combined components does provide some additional useful insights as to why the rate of decline in total production is slowed down. The total production declined more slowly due to increases in the amount of recoverable oil, Q∞, in both the lower 48 states and in NGL (i.e., fcq > 1).
The production rates predicted for 2015 and 2020 by the Hubbert equation (1.76 and 1.45, respectively) are substantially lower than those predicted using the modified equation (2.20-2.26 and 2.06-2.21 for the total and combined analyses, respectively); put the other way, the modified predictions are about 25% and 50% higher. And by 2030, while the Hubbert equation predicts a production rate of 0.95, the modified equation predicts 1.86 (analysis of total production) or 2.26 (combined three components of production). That’s about a 100% difference!
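These percentage comparisons can be checked with a few lines of arithmetic. The sketch below expresses each modified prediction relative to the Hubbert prediction for the same year, using the values quoted above and taking the lower end of each quoted range; the function name is illustrative.

```python
def percent_higher(modified, hubbert):
    """How much higher the modified prediction is, as a percent of the Hubbert prediction."""
    return 100.0 * (modified - hubbert) / hubbert

# (year, Hubbert prediction, lower end of the modified prediction), from the text.
comparisons = [
    (2015, 1.76, 2.20),
    (2020, 1.45, 2.06),
    (2030, 0.95, 1.86),
]
for year, hubbert, modified in comparisons:
    print(f"{year}: modified prediction is {percent_higher(modified, hubbert):.0f}% higher")
```

Using the upper end of each range instead would make the gap larger still.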
This reminds me of how far the Hubbert equation, when fit only to the 1949-1979 total production data, underestimates total production in 2009 (see Figure 2 and discussion).
Non-linear least squares (NLLS) analysis of total consumption
Figure 10 shows the best fit to USA petroleum consumption data (which I assume is equal to petroleum products supplied, in EIA Table 5.1). I show the best-fit to the full data set 1949-2009 and to a more limited data set 1984-2009, which I consider to give a more realistic view of future trends, because it does not include the rapid increase and then decrease in consumption in the 1970s and early 1980s.
Table 2 summarizes the best fit Srss, “a”, Qo and Q∞ for these two time ranges.
The plots and values in Figure 10 and Table 2 are obviously projections based on past behavior, and I am not expecting that USA consumption can continue as it has in the past. They are useful, however, for showing what the future demand for petroleum products might be in the absence of a disruption in supply. The best fits for USA consumption, whether we look at the last 60 years or the last 25 years, are both consistent with a rate constant of consumption (“a”) of about 4 percent per year. If the trend in consumption were to continue unabated, the USA would ultimately consume about 700-750 bbs of petroleum products in total (Q∞).
It is interesting that about the last 6 years of data, from 2004-2009, show the signs of a plateau and rollover to the decline side of the consumption curve. This is likely reflective of the downturn in the economy as well as demographic trends in the USA. Time will tell if the decline in consumption shows a very sharp drop-off, as suggested by the last two years of data, or a more gradual decline, as suggested by the best-fit curve to the 1984-2009 data set.
Comparing future trends in USA petroleum production and consumption
Figure 11 compares the best fits obtained using my modified analysis of the total USA production data and the best fit to the 1984-2009 consumption data.
Although the modified Hubbert equation predicts a slower decline in the USA’s production than predicted using the traditional Hubbert equation, and consumption is predicted to peak and then decline, the discrepancy between what the USA is predicted to produce and consume for 2010 to 2030 is enormous. The projected curves suggest that the maximum difference between consumption and production will occur in about 2016. But, by 2018, the USA's production is predicted to be down to about the same level it was at in 1949. If present trends continue, the USA's consumption is predicted to remain about three times higher than its internal production for the next 20 years. Therefore, for the foreseeable future, the USA will continue to have to rely on getting about two-thirds of its petroleum from international sources.
Who will be able to supply that amount of oil in the coming decade? Or is the USA in for a dramatic, forced decline in consumption? In future posts I will examine the prospects of the USA getting oil from its present international sources.