Facilitating sponsor requests: a relational database model for project and effort reporting.Background
Department research administrators (DA) coordinate communication between principal investigators Noun 1. principal investigator - the scientist in charge of an experiment or research project
scientist - a person with advanced knowledge of one or more sciences requesting extramural extramural /ex·tra·mu·ral/ (-mur´il) situated or occurring outside the wall of an organ or structure.
situated or occurring outside the wall of an organ or structure. funding and the funding agencies that provide it. When a sponsor decides to fund an investigator-initiated application, the principal investigator is asked to address issues such as (a) current and pending support for the investigators, (b) the institution's review and approval processes for sponsored research, and (c) the ability of the researchers identified in the application to complete the research program, given their other commitments. Sponsor requests are typically forwarded to the DA, who has the responsibility for compiling com·pile
tr.v. com·piled, com·pil·ing, com·piles
1. To gather into a single book.
2. To put together or compose from materials gathered from several sources: the information and obtaining institutional endorsement before submitting it to the sponsor.
Sponsor requests can overwhelm o·ver·whelm
tr.v. o·ver·whelmed, o·ver·whelm·ing, o·ver·whelms
1. To surge over and submerge; engulf: waves overwhelming the rocky shoreline.
a. even experienced DAs. To facilitate sponsor requests, data must often be compiled from multi-project research programs and individual research projects that are administered in outside departments. Institutional information systems often do not serve the needs of department-level administrators, and many can neither anticipate renewals or extensions of active projects, nor record changes in effort commitments by investigators. Due to the shared responsibility between institutions and departments for ensuring compliance, DAs may not have all of the information to address sponsor concerns. Landen Landen is a municipality located in the Belgian province of Flemish Brabant. The municipality comprises the city of Landen proper and the towns of Attenhoven, Eliksem, Ezemaal, Laar, Neerlanden, Neerwinden, Overwinden, Rumsdorp, Waasmont, Walsbets, Walshoutem, Wange and Wezeren. and McCallister (2002) reinforce the DA's share in research compliance, thus reinforcing the need for a department-based database system. "Compliance, both financial and non-financial, is the growth industry in research administration, with the responsibility for compliance being integrated into the jobs of central and departmental administrators." (p. 18)
Baker and Wohlpart (1998) summarized information from a survey of college and university chief research officers conducted in July July: see month. 1996 and concluded that "Overall, the most popular grant management systems are manual records and files and modified computer database systems, with over half of the responding institutions indicating these types of systems." (p. 38) They further concluded that of 30 survey respondents In the context of marketing research, a representative sample drawn from a larger population of people from whom information is collected and used to develop or confirm marketing strategy. from public and private R1 research institutions (those that offer the full range of baccalaureate programs, award 50 or more doctoral degrees annually, and receive $40 million or more in federal support), "the most popular grant management system used by R1 institutions is a homegrown home·grown
1. Raised or grown at home.
2. Originating in or characteristic of a locality: "Rock is homegrown music in the United States, evolved from blues and country and Tin Pan Alley" computer database, with 60 percent of the 30 respondents." As these homegrown systems have likely been designed for their respective central research offices, one could infer that the institutional reports provided to the DA would be inadequate for addressing sponsor requests.
The DA is positioned to obtain other support for current and pending projects as well as changes in effort commitments. When an individual project is conducted as part of a larger multi-project program, DAs are familiar with their supported investigator's award portion. While electronic spreadsheets The following is a list of spreadsheets. Freeware/open source software
Database in which all data are represented in tabular form. The description of a particular entity is provided by the set of its attribute values, stored as one row or record of the table, called a tuple. tools that maintain project-related information about investigators would be more useful. This paper describes the characteristics of a relational database, explains the benefits of a relational database model, and provides a blueprint blueprint, white-on-blue photographic print, commonly of a working drawing used during building or manufacturing. The plan is first drawn to scale on a special paper or tracing cloth through which light can penetrate. for designing a relational database model that can be employed to enhance data retention and reporting capability.
Characteristics of a Relational Database
Relational databases consist of at least two separate, but related sets of data. These separate sets of data are typically referred to as Tables. Frye explained, "The most basic object in a database is the table, where you store your data. You might be tempted to jam every type of data you want to store in a single table, but that's hardly ever the right way to design tables in a database." (2001, p.32) Tables are containers for storing similar types of data, and consist of rows and columns. Tables are defined by the fields or data types they contain. Fields store individual cells of data at the intersection intersection /in·ter·sec·tion/ (-sek´shun) a site at which one structure crosses another.
a site at which one structure crosses another. of each row and column inside a database table. Each field of data should be assessed for its appropriateness to the table prior to establishing relationships between key fields in the individual tables. A database record includes all data fields along the same horizontal row within the database table.
As Frye referred to above, understanding relational databases requires overcoming paradigmatic See paradigm. processes of storing all data components in the same place. The relational database concept does not suggest that it is more practical to retrieve similar data from more than one place, but that tables of data can be simultaneously accessible when each shares a related field of data. Although this paper focuses on the creation of a sponsored project and investigator effort reporting database, changing from any spreadsheet spreadsheet
Computer software that allows the user to enter columns and rows of numbers in a ledgerlike format. Any cell of the ledger may contain either data or a formula that describes the value that should be inserted therein based on the values in other cells. or word processing word processing, use of a computer program or a dedicated hardware and software package to write, edit, format, and print a document. Text is most commonly entered using a keyboard similar to a typewriter's, although handwritten input (see pen-based computer) and data retention mechanism to a relational database model requires an understanding of how data collections are related to one another.
Relationship Determination: Program or Project?
The first rational step toward creating relationships between sponsored project data is the characterization A rather long and fancy word for analyzing a system or process and measuring its "characteristics." For example, a Web characterization would yield the number of current sites on the Web, types of sites, annual growth, etc. of research programs as separate entities from individual research projects. These entities should be stored in separate database tables and connected or joined by related fields of data. The field of data on which relationships between database tables are based is commonly referred to as a table's key. Each database table should be created to store information about only one type of object, and should possess at least one key field of data from which relationships to other tables can be based.
An individual research project (RP) may be conducted over several years. Since each year of a RP would have a different award period, it is logical to enter an individual record for each year of the RP into the same database table. A separate database table would store a record to represent the RP as a whole. These two tables would share a related key field of data such as the principal investigator's name or the grant number.
Multi-project programs (MP) provide conclusive Determinative; beyond dispute or question. That which is conclusive is manifest, clear, or obvious. It is a legal inference made so peremptorily that it cannot be overthrown or contradicted. justification for using a relational database model. MP's generally consist of two or more RPs. Like a RP, each project component of a MP may be conducted over several years. Individual records for each year of every component project of a MP should be stored in the same database table with individual records from RPs. The term [Projects] refers to this database table in the text and figures that follow. A separate record representing the MP as a whole would be stored in the same database table with the record representing the RP as a whole. The term [Programs] is used in the following text and figures to refer to this database table.
To establish the relationship between [Programs] and [Projects], a key field of related data should be identified in both. The sponsor grant number is an excellent choice for the key field, as it could be shared by each RP record within a MP record. This will also allow the database user to simultaneously enter and retrieve some or all of a program's records based on an identical field of data. Database construction should consider that a new grant application in its pre-award stage may not possess a sponsor grant number. This obstacle is easily overcome by inserting temporary assignment numbers into the [Programs] and its related [Projects] records. These temporary numbers can be replaced with the actual grant assignment number when it becomes available.
Grant Applications and Awards: Data Collection and Retention
Information pertaining per·tain
intr.v. per·tained, per·tain·ing, per·tains
1. To have reference; relate: evidence that pertains to the accident.
2. to sponsored programs or projects would be maintained when there is current or planned participation of an investigator within a DA's scope of responsibility. Decisions about other types of data that will be collected and retained in the database should be made prior to constructing a database. These decisions will determine the amount of time and effort that will be required to maintain the database. Databases should be designed to simplify the process of entering and retrieving data, allowing the DA to quickly populate To plug in chips or components into a printed circuit board. A fully populated board is one that contains all the devices it can hold. the underlying database tables, while cataloging database records by investigator, application type, or potential funding category. DAs should assess the award data already maintained at the institution and consider the capability of the database design to retain data for awards made from outside their institution when the information may be fragmentary frag·men·tar·y
Consisting of small, disconnected parts: a picture that emerges from fragmentary information.
frag . Since an award for a program or project could meet several conditions, the database design should be able to anticipate all of them. This will prevent database design modifications otherwise necessary should an unanticipated condition arise.
The following list provides several conditions applicable to a particular source of funds:
1. Individual Research Grant--awarded to administrator's institution and department
2. Individual Research Grant--awarded to administrator's institution but outside department
3. Individual Research Grant--subcontracted to administrator's department from outside institution
4. Multi-Project Program--awarded to administrator's institution and department
5. Multi-Project Program--awarded to administrator's institution but outside department
6. Individual Component of Multi-Project Grant--awarded to administrator's institution and department.
7. Individual Component of Multi-Project Grant--awarded to administrator's institution but outside department.
From the list above, DAs would have immediate access to detailed information about items 1, 3, 4, and 6. Detailed award information pertaining to items 2, 5, and 7 are likely not as accessible since the applications were not initiated in the administrator's department. Although the occurrence of certain award categories within a DA's area of responsibility might be infrequent in·fre·quent
1. Not occurring regularly; occasional or rare: an infrequent guest.
2. , decisions to exclude any relative information will restrict the capability of the database to produce accurate reports.
Thus far, this paper has described a DA's need for a database, provided characteristics of a relational database, explained segregation segregation: see apartheid; integration. of data and creation of relationships between tables of data within a database, and has provided some tips on assessing award types to avoid common pitfalls encountered in database design and data maintenance. The remainder of this paper will be devoted to describing a relational database model for sponsored projects and effort reporting. The model described is useful for the management of sponsored project application and award data from most any federal or nonfederal agencies. The fields within certain database tables in this model are structured to mirror award assignment data from the Information for Management, Planning, Analysis, and Coordination (IMPAC IMPAC International Merchant Purchase Authorization Card
IMPAC Intersegmental Major Preparation Articulated Curriculum
IMPAC Information for Management, Planning, Analysis, and Coordination (National Institutes of Health) ) and Computer Retrieval of Information on Scientific Projects (CRISP) systems to identify various aspects of extramural activities of the Department of Health and Human Services Noun 1. Department of Health and Human Services - the United States federal department that administers all federal programs dealing with health and welfare; created in 1979
Health and Human Services, HHS (DHHS DHHS Department of Health & Human Services (US government)
DHHS Dana Hills High School (Dana Point, California)
DHHS Deaf and Hard of Hearing Services
DHHS Deaf and Hard of Hearing Services ).
Database Modeling for Sponsor Reporting
As is the case with most sponsors, institutes at the National Institutes of Health (NIH "Not invented here." See digispeak.
NIH - The United States National Institutes of Health. ) commonly request information about the financial support and effort commitments of investigators. The format requested for this correspondence is the NIH-formatted "Other Support" page. Figure 1 demonstrates a responsive NIH-formatted "Other Support" page for Dr. John Doe John Doe
formerly, any plaintiff; now just anybody. [Am. Pop. Usage: Brewer Dictionary, 329]
See : Everyman . Dr. Doe DOE - Distributed Object Environment: a distributed object-oriented application framework from SunSoft. serves as Principal Investigator on two individual research projects. He also participates as a Co-Investigator on a multi-project P01 grant from the National Heart, Lung and Blood Institute.
[FIGURE 1 OMITTED]
Although the NIH formatted "Other Support" page (Figure 1) could be produced with an electronic spreadsheet or word processor, it is unlikely that an institution would warehouse the data in the displayed format. Since Dr. Doe is not the Principal Investigator of the multi-project grant (P01 HL003456), information about this MP is more likely to be stored by the institution under the name of Dr. J Noun 1. Dr. J - United States basketball forward (born in 1950)
Erving, Julius Erving, Julius Winfield Erving . Moss (the Program Director of P01 HL003456). If an institution chose to store program and project related data in the Figure 1 format, redundant storage of information regarding the multi-project program (P01 HL003456) would be necessary for all participating investigators. According to according to
1. As stated or indicated by; on the authority of: according to historians.
2. In keeping with: according to instructions.
3. Jennings Jennings, city (1990 pop. 11,305), seat of Jefferson Davis parish, SW La., on the Mermentau River; inc. 1888. Cotton and rice are grown, there is a bottling plant, and drugs, machinery, apparel, and water-treatment systems are manufactured. , "Because relational databases eliminate most duplicate DUPLICATE. The double of anything.
2. It is usually applied to agreements, letters, receipts, and the like, when two originals are made of either of them. Each copy has the same effect. information, they minimize data storage and application memory requirements." (1997, p. 118)
[FIGURE 1 OMITTED]
As Figure 1 suggests, a relationship exists between Dr. Doe and the sponsored program (P01 HL003456), although he is not the Program Director of the MP. This relationship is modeled in relational database format in Figure 2.
[FIGURE 2 OMITTED]
Figure 2 demonstrates two tables within a relational database. The [Programs] table holds data fields pertaining to the MPs and the [Projects] table holds data fields of data pertaining to the RPs. The table relationship is demonstrated by a connecting line between the (ProgID) field in [Programs] and [Projects]. The tables share the same unique identifier With reference to a given (possibly implicit) set of objects, a unique identifier is any identifier which is guaranteed to be unique among all identifiers used for those objects and for a specific purpose. , namely the Program Identifier (ProgID) field. The "I" seen just above the connecting line closest to [Programs] indicates that only one record containing the same (ProgID) can exist in the [Programs] table. The "[infinity infinity, in mathematics, that which is not finite. A sequence of numbers, a1, a2, a3, … , is said to "approach infinity" if the numbers eventually become arbitrarily large, i.e. ]" seen just above the connecting line closest to [Projects] indicates that many records bearing the same (ProgID) can exist in the [Projects] table. In database terminology, this relationship is referred to a one-to-many One-to-many in communication is the act of publishing or broadcasting from one sender to many receivers.
One-to-many (also known as "to-many") relationships are often used when managing databases. relationship.
The one-to-many relationship between [Programs] and [Projects] eliminates the need for redundant entries of program-specific information. A single database form can be used to enter data into both tables simultaneously. Likewise, a single database query can simultaneously retrieve data from both database tables to produce a database report. Although detailed procedures for the creation of database forms, queries, and reports are beyond the scope of this paper, production of a report similar to Figure 1 will require a working knowledge of each. The primary objective is to understand that relationships between database tables are what provide functionality to the database and the database user.
Relationships among four tables in a sponsored project database are graphically demonstrated in Figure 3. Two data tables, [Investigator] and [Effort], have been added to the database relationships originally demonstrated in Figure 2. [Investigator] stores the name (PIName) and unique number (PINumber) for investigators associated with current or pending applications and awards. A one-to-many relationship is established between the (PINumber) field of [Investigator] and the (PINumber) fields of both [Programs] and [Projects]. The one-to-many relationship allows a single entry of the investigator name (PIName) in [Investigator]. This model provides functionality for the database to append To add to the end of an existing structure. a unique number (PINumber) representing each participating investigator, into [Programs] and [Projects]. Similarly, a relationship has been created between the (ProjID) field of both [Projects] and [Effort]. This relationship enables multiple effort records to be stored in [Effort] and linked to their related record in [Projects]. This linking is performed via the project's unique project identifier (ProjID), independently of the participant's role on the program or project.
[FIGURE 3 OMITTED]
These relationships are modeled after the real-world fact that an Investigator could participate as (a) the Program Director of an MP, (b) the Principal Investigator or Co-Investigator of an RP, or (c) the Principal Investigator, Core Leader, or Co-Investigator of an RP being conducted within one or more years of an MP. The underlying datasets for [Programs] and [Projects] are demonstrated respectively in Figure 4 and Figure 5.
[FIGURES 4-5 OMITTED]
As demonstrated in Figures 4 and 5, the names of the Principal Investigator or Program Director are not found in the dataset See data set. of [Programs] or [Projects]; however, this information is available via the referenced (PINumber) field. Because relationships exist between the (PINumber) of [Investigator], [Programs], and [Projects], information about any MP or RP could be entered or retrieved simultaneously with information about participants, participant effort, and role of the participant on the RP. The underlying dataset of [Investigator] is demonstrated in Figure 6, and the underlying dataset of [Effort] is demonstrated in Figure 7.
[FIGURES 6-7 OMITTED]
As demonstrated in Figure 6, the investigator's name is stored only once in [Investigators]. The corresponding (PINumber) is assigned as·sign
tr.v. as·signed, as·sign·ing, as·signs
1. To set apart for a particular purpose; designate: assigned a day for the inspection.
2. automatically as the investigator's name is added to the [Investigator] table. The one-to-many relationship between [Investigator] and [Projects] requires the investigator's record to exist in [Investigator] before the investigator's (PINumber) can be added to [Projects].
Figure 7 displays the records contained in [Effort]. The one-to-many relationship between the (ProjID) in [Projects] and [Effort] requires the project's record to exist in [Projects] before a corresponding [Effort] record can be added. The relationship allows multiple investigator effort records to be stored in [Effort]. When multiple investigator effort records are added to [Effort] for the same [Project], the same project identifier (ProjID) is appended to each related effort record.
Although this paper does not provide detailed instruction on writing database queries, it should be understood that database queries are the engines for simultaneous retrieval of data from multiple database tables. For the purposes of this exercise, we define the query as a simultaneous request for multiple fields of data from each of four database tables--[Programs], [Projects], [Investigator], and [Effort].
Table 1 provides the database tables, fields, and specific conditions required in a query to produce the NIH formatted "Other Support" page originally demonstrated in Figure 1.
Extrapolated Benefit vs. Time Commitment
Although this database model was designed for its ability to produce the NIH formatted Other Support, the relationships defined in it can provide substantial additional benefits to the DA. The model demonstrated in this paper can be quickly deployed and provides substantial capability for the minimal effort required to maintain it. With only slight revisions to the query outlined in Table 1, reports could be generated to forecast future years of support for an investigator. The (StartDt) and (EndDt) fields found in [Projects] could be added to a query. By specifying date range criteria consistent with an NIH award year, DAs can forecast award levels for all NIH funding they administer. These same fields could be used in queries to forecast progress report due dates, or upcoming project expirations. Querying the database by the (Status) field can provide detailed listings for active, pending, or future committed awards for all investigators within a DA's area of responsibility. Specifying criteria in the (investigator) field can provide similar reports for a single investigator. Adding fields to the [Programs] or [Projects] tables to store institutional review board items such as animal care or human subject approval dates and protocol numbers would enable DAs to track protocol expirations for active projects. The possibilities are limited only by the amount of effort a DA wishes to commit to the database.
Relational databases are more effective than electronic spreadsheets or word processors for maintaining and reporting sponsored project and investigator effort information. DAs possess unique familiarity with their investigator's effort and award information. The process by which DAs must merge their knowledge with institutionally provided data is cumbersome cum·ber·some
1. Difficult to handle because of weight or bulk. See Synonyms at heavy.
2. Troublesome or onerous.
cum . The DA equipped with a relational database similar to the one described in this paper is better equipped to fulfill ful·fill also ful·fil
tr.v. ful·filled, ful·fill·ing, ful·fills also ful·fils
1. To bring into actuality; effect: fulfilled their promises.
2. sponsor requests. Automation of the NIH formatted "Other Support" page is only a single example of what can be accomplished with relational databases. It is my hope that this paper will encourage other administrators to implement relational database solutions that can expedite ex·pe·dite
tr.v. ex·pe·dit·ed, ex·pe·dit·ing, ex·pe·dites
1. To speed up the progress of; accelerate.
2. sponsor deliverables and automate To turn a set of manual steps into an operation that goes by itself. See automation. other routine data warehousing See data warehouse.
data warehousing - data warehouse needs they may have.
Table 1--Queried Tables, Fields, and Conditions Table Field Condition [Effort] (ParticipantName) Where = "Doe, John J." [Effort] (EffortPercent) None [Effort] (ProjectRole) None [Programs] (ProgAgency) None [Projects] (AppType) None [Programs] (ProgActCode) None [Programs] (ProgAdmOrg) None [Programs] (ProgSerialNumber) None [Projects] (ProjYr) None [Programs] (ProgStart) None [Programs] (ProgEnd) None [Investigator] (PINumber) Query without Display [Investigator] (PIName) Where [Investigator]![PINumber] = [Programs]![PINumber] [Projects] (Status) Where = "Active" or "Pending" [Projects] (DirCost) None [Projects] (MajorGoals) None
Landen, M. & McCallister, M. (2002). Evolution or Revolution? Examining Change in Research Administration. The Journal of Research Administration, (33)3, 17-20.
Baker, J.G. & Wohlpart, A. (1998). Research Administration in Colleges and Universities: Characteristics and Resources. Research Management Review, (10) 1, 33-55.
Frye, C. (2001). Microsoft (Microsoft Corporation, Redmond, WA, www.microsoft.com) The most successful and influential software company. Microsoft's software and Intel's hardware pioneered the PC and revolutionized the computer industry. [R] Access Version 2002 Plain & Simple. Redmond Redmond, city (1990 pop. 35,800), King co., W Wash., a suburb of Seattle, on Lake Sammamish; inc. 1912. Its economy centers around computer software (Microsoft Corp. , WA: Microsoft Press.
National Institutes of Health, (2003). Activity Codes, Organization Codes, and Definitions Used in Extramural Programs; IMPAC (Information for Management, Planning, Analysis, and Coordination) A Computer-Based Information System of the Extramural Programs of NIH/DHHS.
CRISP (Computer Retrieval of Information on Scientific Projects) Version 18.104.22.168; A Biomedical bi·o·med·i·cal
1. Of or relating to biomedicine.
2. Of, relating to, or involving biological, medical, and physical sciences. database system containing information on research projects and programs supported by the Department of Health and Human Services. <http://crisp.cit.nih.gov/>
Jennings, R. (1997). Que n. 1. A half farthing. [R] Using Access 97. Indianapolis Indianapolis (ĭn'dēənă`pəlĭs), city (1990 pop. 731,327), state capital and seat of Marion co., central Ind., on the White River; selected 1820 as the site of the state capital (which was moved there in 1825), inc. 1847. , IN: Macmillan Macmillan, river, c.200 mi (320 km) long, rising in two main forks in the Selwyn Mts., E Yukon Territory, Canada, and flowing generally W to the Pelly River. It was an important route to the gold fields from c.1890 to 1900. Computer Publishing USA.
Mark A. Hughes
Vanderbilt University Medical Center The Vanderbilt University Medical Center (VUMC) is a collection of several hospitals and clinics associated with Vanderbilt University in Nashville, Tennessee. It comprises the following units: