Monday 13 November 2017

Bioassay97: a new EXCEL VBA macro to perform statistical analyses on pesticide dose-response data

From the beginning of my activity I felt the need of developing simple software tools that would enable users with a limited background in statistics and computer programming (mainly students, practitioners and technicians) to perform sound statistical analyses directly from within the most widespread spreadsheet, with no additional cost requirements. The set of routines that is hereby presented has been progressively developed over the years, to support the intensive field research activity that is carried out at my institution and it was specifically tailored to the problems posed by routine agriculture experiments.

Download the macro

If you are a lazy reader and want to download the macro straight away, follow this link here. This is usually the latest version. You can also find an example spreadsheet, to test the macro at this link.

References

I spent some of my spare time developing this macro. If you find it useful, I would greatly appreciate if you could cite my work, by referring to one of these papers:

  1. Onofri A., 2005. Bioassay97: a new excel® vba macro to perform statistical analyses on herbicide dose-response data. Italian Journal of Agrometeorology, 3, 40–45. (PFD HERE)
  2. Onofri, A., Pannacci, E. (2014). Spreadsheet tools for biometry classes in crop science programmes. Communications in Biometry and Crop Science 9 (2), 43–53 (PDF HERE).

Usage

For a smooth usage experience, please note the following steps:

  1. download the file Bioassay97xp.XLS and put it everywhere on the hard-disk, or external devices (pen-drives, removable media…). Please do not rename it! After done this, you might like to create a shortcut on your desktop to lunch the macro more easily;
  2. make sure that the SOLVER add-in is installed and enabled in Excel;
  3. make also sure that macros are enabled in EXCEL;
  4. make sure that programmatic access to the Office VBA project is not denied.

If one of the above items is not ok, BIOASSAY97 will not work. In this case, take the following actions.

How to enable the SOLVER add-in

The add-in macro SOLVER.XLAM is included by default in Excel but kept disabled. In order to enable it, click the FILE Menu and choose OPTIONS. In the Excel Options dialogue box, select ADD-INS from the left sidebar and then hit the GO button, next to Manage Excel Add-ins at the bottom. Select the SOLVER ADD-IN option and hit OK to enable it. Once you’ve enabled the Solver, this will be available under the Data tab on the Data ribbon in Excel.

How to enable macros in Excel

To enable macros, you need to follow a version-dependent procedure. In Excel 2010–2016 : click the FILE MENU and select Options from the left sidebar. Select TRUST CENTER and press TRUST CENTER SETTINGS. In Trust Center Settings, select MACRO SETTINGS from the left sidebar, choose DISABLE ALL MACROS WITH NOTIFICATION and hit the OK button. Save and close Excel completely. In Excel 2007 : click the EXCEL OPTION button in the lower right. Click the TRUST CENTER button on the left. Then, at the bottom right, select TRUST CENTER SETTINGS. In the next window, select MACRO SETTINGS, then select the radio button for DISABLE ALL MACROS WITH NOTIFICATION. To close the Trust Center window, click the lower right OK button. Save and close Excel completely and reopen Excel.

How to enable programmatic access to the Office VBA project

In Excel 2010–2016 : click the FILE MENU and select Options from the left sidebar. Select TRUST CENTER and press TRUST CENTER SETTINGS. In Trust Center Settings, select TRUST ACCESS TO THE VBA PROJECT OBJECT MODEL and hit the OK button. Save and close Excel completely.

How to launch the macro

Open Bioassay97xp.XLS by double-clicking on it. Two things should happen: (1) a SECURITY WARNING pop-up notification appears, beneath the Office ribbon. Click the “Options” button and Select the radio box beside “Enable this content,” then click “OK.” You will have to do this every time, unless you selected the radio button for “Enable all macros” in the window at stage 2 (do so only if you know what you are doing!).

TROUBLE SHOOTING #1: NOTHING OF THE ABOVE HAPPENED!!!

This is because you have downloaded the macro from the Internet. Please, follow these steps: (1) from VIEW, click on UNHIDE and select BIOASSAY97 from the form that appears. (2) A service sheet will appear and, at the same time, you will se a notification pop-up appearing beneath the ribbon. Click on the appropriate button to enable the use of files downloaded from the Internet. (3) If prompted to do so, enable macros, as shown above. (4) Hide the macro (VIEW/HIDE), close Excel and save Bioassay97xp.XLS. This latter step is fundamental, to prevent this same problem from arising in the future.

Now click on the “Add-in” menu on the ribbon: you’ll notice that one new entry is added to the tools menu of Excel (Biologic Assay), which can be used to start the analyses.

TROUBLESHOOTING #2: I GET AN ERROR MESSAGE

An Excel data sheet must be opened, before clicking on any of the above buttons. Otherwise, you’ll get an error message (‘run-time error ‘91’: Object variable or With block variable not set’)!

Setting the analysis

Experimental data have to be organised in a database, with observations in rows, and variables in columns. Basically, at least a dose column and a response column are needed, but a categoric variable might be required as well if several response curves have to be simoultaneously fitted. The first row is reserved for variable names.

Look at this link to see an exemplary dataset. Read ahead if you want detail.


Introduction

Nonlinear regression analysis has been recognised as a very appropriate tool to analyse dose-response studies, involving pesticide antagonism, synergism, selectivity and resistance, as well as the effect of safeners, adjuvants and environmental side effects of pesticides. Advantages over the other techniques involving the linearisation of data (especially probit and logit analysis) are in that these latter methods have usually limited biological meaning and do not always allow for an appropriate description of data in certain response ranges, such as at extremely low and high doses.

Almost all the statistical aspects of non-linear regression analysis on bioassay data have been reviewed and reconsidered after the massive introduction of personal computers and this technique has considerably spread in the past few years, especially in weed science. A list of references has been given later on.

However, several authors insist on analysing dose-response data by linearisation or even by multiple comparison tests. Likely, a further spreading of non-linear regression techniques is restrained by the fact that calculations, though made easier by commercially available statistical packages, still require a certain degree of statistical expertise. Indeed, non-linear regression routines found in those statistical packages have not been specifically developed to analyse bioassay data: certain routine tasks, such as the calculation of ED levels with confidence limits or the comparison among different dose-response curves, are not easily performed without a certain programming effort.

The aim of this work was to develop an EXCEL VBA macro, specifically thought to deal with dose-response curve analyses, which would enable users with a limited background in statistics to automatically perform log-logistic analyses, directly inside the most common spreadsheet.

All the rationale and theory behind this macro have been taken by Streibig et al. (1993 a and b) and all the tools have been included to carry out appropriate analyses on the main part of bioassays regarding the study of factors influencing pesticide performances.


Single curve analysis

A single curve analysis is usually performed whenever it is necessary to estimate ED-levels for a single dose-response curve.

Model selection

A response model, a dose variable and a response dose can be selected from drop-down menus. Concerning models, three selections are possible: (1) Log - logistic model (sigmoidal symmetric responses on log-dose); (2) Weibull model (logistic asymmetric responses) (Streibig et al., 1993); (3) Peaked model (Brain and Cousens, 1989; logistic responses with stimulation at low doses).

See the earlier cited papers for a detailed description of each single model and of biological meaning of parameters. Indeed, a log-logistic model would fit in most of the cases, while a Weibull model should be selected in case of clearly asymmetric responses. The peaked model is suitable in case of response stimulation at low doses, but its usage should be carefully evaluated, as its mathematical properties are not as good as those of the other two sigmoidal models.

Model and variables choice
Model and variables choice

Increasing or decreasing curves

Each model can be taken to handle both increasing and decreasing curves, such as those based on growth of test-species (that is expected to decrease, as dose increases) or on growth inhibition (that is expected to increase, as dose increases). The macro itself will decide whether an increasing or a decreasing curve is required and will provide itself for the necessary adjustments on the model.

Constraints

Constraints can be put on the lower and/or on the upper asymptotes. This might be necessary to reach convergence or to improve the estimates of the other parameters. In particular, constraints might be needed: (1) for biological reasons; (2) whenever the lower asymptote is not significantly different from 0 or negative; (3) dealing with curves based on percentage pest control, which are often supposed to range from 0 to 100%; (4) asymptotes are measured or known without any experimental error.

Starting values for parameters

BIOASSAY97 does not provides itself starting values for parameters. Basically, the highest and the lowest observed response values can be used as the starting points for the higher and the lower asymptotes respectively. A starting value for the inflection point can be relatively easily estimated from observed data (just choose a dose value which gave a response approximately half way between the higher and the lower asymptote), while a value of 1 for the slope should be considered appropriate in most cases. Likewise, a value of 1 should be appropriate for the parameter describing stimulation, if needed.

If a constrained model has been chosen, the value to which the parameter has to be constrained must be entered here. For example, if the user wants to constrain the lower asymptote to 0, such a value has to be entered in the apposite input box.

Lambda values for the transformation of response

Data do not always meet the basic assumptions for regression analyses. Tipically, whenever the difference between the highest and the lowest observed response is higher than one order magnitude, variances will not be constant across all the treatments. Among the Box and Cox (1964) families of transformations, the following one has been chosen:

where W is the transformed variable, Y is the untransformed variable, l is the transformation parameter and is the geometric mean of the observations. By default l is set to 1 (no transformation); a value of 0,5 means that a square root transformation is performed, while a value of 0 means that a logarithmic transformation is performed. Often, a value of 0,25 has been found to be appropriate for dose-response analyses. A maximum likelihood value for lambda can be chosen by a direct comparison of RSS values obtained with different values of l (Box e Draper, 1987).

In order to obtain parameter estimates on the original scales, a Transform Both Sides technique has been adopted (according to Streibig et al., 1993).

ED levels and confidence limits

By default, the program itself will calculate the most useful ED levels, together with confidence limits. Indeed, in the case of decreasing curves, ED10, ED30 and ED50 are calculated, while in the case of increasing curves, ED50, ED70 and ED90 are calculated. Should an additional ED level be required together with confidence limits, the corresponding response level must be entered on the appropriate input box (for example if the ED85 level is required, enter 85 on the textbox).

Confidence limits are calculated by using the delta method. For the peaked curve of Brain and Cousens, confidence limits for ED-levels are calculated by using the inference band for the expected response, as shown by Bates and Watts (1988) and Snedecor and Cochran (1991).


Multiple curve analysis

While the previous analyses is normally performed to estimate ED-levels, simultaneous fitting of several curves is performed when dose-response curves need to be compared on a statistical basis.

To be able to simultaneously analyse several dose-response curves, at least three column have to be included on the spreadsheet: a dose variable, a response variable and a categoric variable, coding for the different curves.

Example of database selection, when different curves have to be simultaneously fitted to observed data. In this example the performances of two herbicides are gong to be compared.
Example of database selection, when different curves have to be simultaneously fitted to observed data. In this example the performances of two herbicides are gong to be compared.

In some cases, all the curves share the same untreated check (pesticide rate equals to zero); this may happen for example when comparing the performances of several pesticides. In this case, the untreated check might be assigned arbitrarily to one of the curves, but common higher asymptotes have to be chosen for the different curves later on.

The three variables (dose, response and category) have to be selected from drop-down menus on the apposite window.

Model selection

Two models can be selected: a logistic and a Weibull model. A peaked model has not been included, because a comparison of dose-response curves on the stimulation range is not normally carried out. If needed, stimulation can be masked to improve the precision of estimates.

After selecting the type of model, the user should state whether the different curves have common parameters. Following figure shows an example wherein the user has requested the macro to consider different logistic curves with common asymptotes and slopes (parallel curves).

A lambda value should also be provided here. See single curve analyses for more information on data transformation.

Example of model selection.
Example of model selection.

Starting values for parameters and constraints

After selecting the model, the user is prompted to enter, for each curve, starting values for parameters. In the case that lower or higher asymptotes have to be constrained to a specific value, the apposite option button is to be selected.

For example, the following figure shows a case in which the user has requested the macro to constrain the lower asymptote of curve 1 one to 0.

In the case of multiple curves analysis, the macro does not provide starting values for parameters, but it just memorise the lastly specified values. It has been assumed that the user has already carried out separate analyses for each curve and thus has already precise indications on possible starting values for parameters.

Selection of constraints on one curve
Selection of constraints on one curve

Iteration process and results

The iterative procedure for parameter estimation is carried out by the EXCEL Solver. If convergence cannot be reached, a warning message is displayed and the analyses is stopped. Very often this depend on overparameterisation and/or wrong starting values for parameters. A more careful selection of model and/or starting parameter values should improve the iterative process.

When convergence is reached, results are displayed on a new worksheet, that is added to the current workbook. All the statistics are displayed, together with graphs to allow for appropriate graphical analyses of residuals to be performed.

References and further reading

BATES, D.M., WATTS, D.G., 1988. Nonlinear regression analysis & its applications., John Wiley & Sons, Inc, New York. BOX, G.E.P., COX, D.R., 1964. An analysis of transformations. Journal of the Royal Statistical Society, B–26, 211–243. BRAIN, P., COUSENS, R., 1989. An equation to describe dose responses where there is stimulation of growth at low doses. Weed Research 29, 93–96. DRAPER, N.R., SMITH, H., 1981. Applied regression. John Wiley & Sons Inc. FINNEY, D.J., 1979. Bioassay and the practice of statistical inference. International Statistical Reviews, 47, 1–12. STREIBIG, J.C., RUDEMO, M., JENSEN, J.E., 1993a. Dose-response curves and statistical methods. In “Herbicide bioassays”, ed. Streibig J.C. e Kudsk P., Boca Raton, 29–55. STREIBIG, J.C., 1988. Herbicide bioassay. Weed Research 28, 479–484. STREIBIG, J.C., JENSEN, J.E., OLOFSDOTTER, M., HAAS, H., ANDREASEN, C., LAWAETZ, E., 1993. Testing hypotheses with dose-response curves. Proc. 8th Symposium “Quantitative approaches in weed and herbicide research and their practical application”, Braunschweig, 423–431.

Aknowledgements

The author wishes to thank Euro Pannacci (DSAA, University of Perugia), Ivan Sartorato (IBAF, CNR Padova), Albert Fisher (University of California) for testing the macro and providing useful comments and suggestions.