An introduction to regression.

[ILLUSTRATION OMITTED]

Sir Francis Galton (1822-1911) studied medicine at Cambridge, but when his father died in 1844 he no longer needed to work, so he embarked on a tour of the Nile. After some exploration of Namibia he gave up travelling and settled to a life of science. As a psychologist he introduced the idea of a survey to collect data and was the first to promote the study of twins. His use of maps to show high air pressure areas led to the development of scientific weather forecasting. An experiment with breeding sweet peas inspired Galton to think up the idea of regression analysis in the 1870s and of statistical correlation in 1888. Using these statistical tools he was able to convince Scotland Yard of the benefit of using fingerprints to identify people. In later years he became one of the first to apply the evolutionary theories
condensed from the Wikipedia

When I first used VisiCalc, I thought it a very useful tool when I had the formulas, but how could I design a spreadsheet if there was no known formula for the quantities I was trying to predict? A few months later I learned to use multiple linear regression software and suddenly it all clicked into place: all I needed was a data sample and the regression software would give me a formula and some idea of the limits of accuracy. Spreadsheets and regression both existed long before computers, but they became much more powerful tools in their computer form.

While some topics in mathematics appeal to our sense of elegance, there are others, like regression, that grab our attention because of their utility. Respect for elegance or utility is a reaction that grows from having understood a topic, but it does not help to introduce it. When introducing a new topic, we need a variety of activities in the hope of catching the interest of a corresponding variety of learning styles. I try to include something of the history of the people who first explored the topic. The story of Sir Francis Galton's contribution to science and statistics leaves little doubt that his development of statistics arose from many practical and innovative pursuits.

Students who think visually are often helped by the demonstration to be found at www.dynamicgeometry.com/javasketchpad/gallery/pages/least_squares.php. This demonstration was developed by Bill Finzer and is included on the Geometer's SketchPad
site as one of several examples of how a JavaSketchPad model can be built into a web page. The screen dump on the next page has had to be simplified from the highly coloured, dynamic version that you can find at the website.

[ILLUSTRATION OMITTED]

The data points are labelled P(1), P(2) ... and the task is to move the oblique line until the sum of squares of the distances between the data points and the line is minimised. The large square at the bottom right-hand corner has an area equal to the sum of the areas of the smaller squares. By moving the coloured dots labelled slope and y-intercept, the gradient and height of the regression line can be changed. All you have to do is keep fiddling until the total area has its least value. This is not a trivial task.

This activity may be all that some students will need to develop sufficient confidence in what is happening when their calculator fits a regression line to a set of data points. This is something of a black-box approach in which we do not know how it works but we do not care anyway. Other students need to know how the demonstration works. The code is all there in the public domain: just click on View and select Source. The instruction set is in the page header and begins:

{1} Point(359,246)[hidden];
{2} Point(356,35)[label('P(6)')];
{3} Point(311,63)[label('P(5)')];
{4} Point(252,82)[label('P(4)')];

followed by another 84 lines. Each line is easy to follow, but the whole construction is complex.
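For students who would rather read a few lines of code than 88 lines of construction, the "keep fiddling" idea can be sketched in Python. This is not the JavaSketchPad code and makes no claim about how that demonstration is built: the function names, the nudging strategy and the sample data are all assumptions made for illustration, and the distances are taken to be vertical ones, as is usual for a regression line.

```python
def total_square_area(slope, intercept, points):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


def fiddle(points, slope=0.0, intercept=0.0, step=1.0, min_step=1e-6):
    """Nudge slope and intercept in turn, keeping any nudge that shrinks the area."""
    best = total_square_area(slope, intercept, points)
    while step > min_step:
        improved = False
        for d_slope, d_intercept in ((step, 0), (-step, 0), (0, step), (0, -step)):
            trial = total_square_area(slope + d_slope, intercept + d_intercept, points)
            if trial < best:
                slope, intercept, best = slope + d_slope, intercept + d_intercept, trial
                improved = True
        if not improved:
            step /= 2  # no nudge helped, so try finer adjustments
    return slope, intercept, best


# Hypothetical data, not the points used in the demonstration.
sample = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]
slope, intercept, area = fiddle(sample)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, total area={area:.3f}")
```

The loop mimics a student dragging the two coloured dots: it tries a small move in each direction, keeps the move if the total area falls, and switches to finer moves when nothing helps.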
The JavaSketchPad construction is, after all, a demonstration piece. If your students have already learned to use JavaSketchPad, you could let them satisfy their curiosity by playing with the construction and making minor alterations so that they can see what each section of the code is doing. However, while you may be justified in teaching JavaSketchPad to a geometry class, you may not wish to invest that much time with a statistics class.

On the other hand, you can reasonably expect that they might try a cut-down version on their own graphics calculator. The following example works well on a ClassPad.

Plot and constrain the points A (-2,4), B (2,1), C (4,-4) and D (0,0) as shown below.

[ILLUSTRATION OMITTED]

Plot the points E (3,0) and F (0,4). Construct and constrain the axes DE and DF and hide E and F. Check that you now have the nondynamic parts firmly fixed in place.

Define a point G on the y-axis DF. Plot a point H at (-3,3). Draw the future regression line GH. If you move the point G, you will change the y-intercept, and you can change the slope by moving the point H.

Highlight the x-axis DE as well as the point A and construct a perpendicular line. Because point A and the line DE are constrained, so is the perpendicular line. Place similar lines through B and C perpendicular to DE. Identify the intersections of the new lines with GH as the points I, J and K as shown below. As you move either of the points G or H you will see that the points I, J and K are constrained to follow the tramtracks AI, BJ and CK. Select the tramtracks and hide them.
Check that I, J and K are still constrained to the vertical lines through the points A, B and C.

The line of best fit is found when the points G and H are moved such that the expression $AI^2 + BJ^2 + CK^2$ is minimised. To illustrate this we will build little squares on each of the line segments AI, BJ and CK.

[ILLUSTRATION OMITTED]

As mentioned in a previous article, it is not easy to construct stable quadrilaterals that will withstand manipulations. In this case we constrain the angles IAL and AIM to be 90° and set the slope of LM to ∞ so as to form a rectangle. Make AI and AL equal, thus forcing AIML to be a square. Repeat this procedure with BJON and CKQP. We should now have three squares that change size as we change the positions of G and H.

Select the three points A, L and M. At the left-hand end of the measurement bar, select the Area icon and, for this example, the area of ALMI is about 1.01 unit². Tap on the area measurement and drag it toward the bottom of the work area. This will leave the title "Area:" in the Measurement Window. You can now edit the word to a more appropriate description. Just change it to read "Area A:". Tap the tick. Then repeat for the other two squares like this.

[ILLUSTRATION OMITTED]

From the Draw Menu choose Expression. Each of the previous measurements is now numbered in a small box. Tap on the first box and @1 appears in the Measurement Bar. Type "+". Then tap on the second and you have @1+@2 in the Measurement Bar. Keep going until you have @1+@2+@3 and then tap the tick. You now have a total area to slide down under the other area measurements.
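As a check on the construction, the line that minimises $AI^2 + BJ^2 + CK^2$ for the points A, B and C can be computed directly. The sketch below uses the standard least-squares formulas for a straight line and Python's exact fractions. The function names are illustrative assumptions and are not part of the ClassPad.

```python
from fractions import Fraction

# Points A, B and C from the ClassPad construction.
points = [(-2, 4), (2, 1), (4, -4)]


def least_squares_line(points):
    """Return (slope, intercept) minimising the sum of squared vertical distances."""
    n = len(points)
    mean_x = Fraction(sum(x for x, _ in points), n)
    mean_y = Fraction(sum(y for _, y in points), n)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def sum_of_squares(slope, intercept, points):
    """AI^2 + BJ^2 + CK^2 for the line y = slope * x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


slope, intercept = least_squares_line(points)
total = sum_of_squares(slope, intercept, points)
print(slope, intercept, total)  # prints: -5/4 2 7/2
```

The best-fit line is y = -5/4 x + 2, and the three full squares have a minimum total area of 7/2. The total of about 1.75 quoted for the ClassPad screen is exactly half of this. One possible reading, offered only as an inference, is that selecting three points such as A, L and M measures a triangle, which is half of each square.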
[ILLUSTRATION OMITTED]

All the students have left to do is move the point G to different places along the y-axis and move H to different places to change the slope; they should be able to get the expression for the total area close to 1.75 as shown below. By choosing the initial three points A, B and C carefully, I have ensured rational coefficients for the regression line, which can be viewed in the measurement bar.

[ILLUSTRATION OMITTED]

You can now show students how to use the regression software which is built into the spreadsheet to obtain the same answer much more quickly. Simple geometric examples like this assist visual thinkers to build a helpful dynamic model of how the regression line is determined.

Hartley Hyde
cactus.pages@internode.on.net