Simulation of sheared suspensions with a parallel implementation of QDPD.

A parallel quaternion-based dissipative particle dynamics (QDPD) program has been developed in Fortran to study the flow properties of complex fluids subject to shear. The parallelization allows for simulations of greater size and complexity and is accomplished with a parallel link-cell spatial (domain) decomposition using MPI. The technique has novel features arising from the DPD formalism, the use of rigid body inclusions spread across processors, and a sheared
boundary condition. A detailed discussion of our implementation is presented, along with results on two distributed memory architectures. A parallel speedup of 24.19 was obtained for a benchmark calculation on 27 processors of a distributed memory cluster.

Key words: dissipative particle dynamics; domain decomposition; mesoscopic modeling; parallel algorithms; rheology; spatial decomposition; suspensions.

1. Introduction

Understanding the flow properties of complex fluids like suspensions (e.g., colloids, ceramic slurries, and concrete) is of importance to industry and presents a significant theoretical challenge. The computational modeling
[...] particles move according to Newton's laws. However, the DPD "particles" are a mesoscopic description of the fluid, and do not represent individual atoms or molecules, but loosely correspond to "lumps" of fluid or clusters of molecules. As a result, the interactions between the DPD particles are not directly based on a Lennard-Jones potential. Instead, the particles are typically subject to three types of forces, namely conservative forces, dissipative forces, and a random force. All of the forces conserve momentum and mass. The conservative force is simply a central force, derivable from some potential. The dissipative force is proportional to the difference in velocity between particles and acts to slow down their relative motion. The dissipative force can be shown to produce a viscous
effect. The random force (usually based on Gaussian random noise) helps maintain the temperature of the system while producing a viscous effect. It can be shown that, in order to maintain a well-defined temperature consistent with a fluctuation-dissipation theorem [3], the coefficients describing the strength of the dissipative and random forces must be coupled. By mapping the DPD equations of motion to the Fokker-Planck equation [4], it has been demonstrated that the DPD equations can recover hydrodynamic behavior consistent with the Navier-Stokes equations.

As in MD, the forces on each particle are computed in each time step. The particles are then moved and the forces recomputed. In DPD the interparticle interactions are chosen to allow for much larger time steps, so that physical behavior on time scales many orders of magnitude greater than is possible with MD may be studied. The original DPD algorithm [2] used an Euler algorithm for updating the positions of the free particles
(which represent "lumps" of fluid), and a leap-frog algorithm for updating the positions of solid inclusions (rigid bodies). Our algorithm QDPD [5], for quaternion-based dissipative particle dynamics, is a modification of DPD that uses the velocity-Verlet algorithm of Groot and Warren [6] to update the positions of both the free particles and the solid inclusions. The velocity-Verlet algorithm for DPD [5] is chosen because it is less sensitive to variation in time step size than the Euler algorithm. The solid inclusion motion is determined from the quaternion-based scheme of Omelyan [7] (hence the Q in QDPD).

QDPD in its present form is being used to study the steady-shear viscosity of a suspension of solid inclusions (such as ellipsoids) in a Newtonian fluid. The model consists of N particles moving in a continuum
[...] a location in space so that they approximate the shape of the object [8]. The motion of these particles is then constrained so that their relative positions never change. The total force and torque are determined from the DPD particle interactions and the rigid body moves according to the Euler equations. As mentioned above, our simulations use a quaternion-based scheme developed by Omelyan and modified by Martys and Mountain [5] for a velocity-Verlet algorithm to integrate the equations of motion. Finally, we use a Lees-Edwards boundary condition [9] (pp. 246-247) to produce a shearing effect akin to an applied strain at the boundaries.

The basic idea is to compute all of the forces on each particle (which accounts for the momentum change in the collision phase) during each time step, and then move the particles (propagation phase). The forces are short-range and are a sum of contributions over pairs of particles. The interaction decays rapidly with separation, which means that only particles closer than some cutoff distance $r_c$ need be considered. Several methods are available for identifying the nearest neighbors of a particle, i.e., those within the cutoff distance. QDPD uses an implementation of the link-cell method of Quentrec et al. [10] described in Allen and Tildesley's book [9] (pp. 149-152). Here, the simulation box is partitioned into a number of cells. For example, see Fig. 1, which depicts a 2D system.
To find the particles within the cutoff distance $r_c$ of the particle shown in the central cell, it is sufficient to consider only particles within the central cell and each of its eight nearest neighbor cells (where $r_c \le$ the cell widths in X and Y, $I_x$ and $I_y$). Newton's third law makes it possible to consider only half of the nearest neighboring cells, which are cross-hatched (lines parallel to the right-diagonal in the cell) in the figure. Generalizing this to all particles in the system, a linked list of all the particles contained in each cell is constructed every timestep. Then, for each particle, the selection of all particles within the cutoff is achieved by looping over one half (considering Newton's third law) of all nearest neighbor cells, and considering only the particles within these cells. We show this schematically in Fig. 2.

[FIGURE 1 OMITTED] [FIGURE 2 OMITTED]

The search scheme for the forces calculation involves an outer loop over all 25 link-cells. In this outer loop, each particle in a link-cell interacts with all particles within its link-cell that are within $r_c$ of the particle. Then there is an inner loop over four of the eight nearest neighbor link-cells, and each particle interacts with all of the particles within the chosen neighbor link-cells that are within $r_c$ of the particle. For example, particles in cell 13 interact with other particles in 13 plus particles in 17, 18, 19, and 14 that are within the cutoff distance of the chosen particle.
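The per-cell linked list mentioned above can be sketched with two integer arrays, in the style of the Allen and Tildesley routines. This is a minimal Python illustration, not the QDPD source: the names `build_cell_lists` and `particles_in_cell`, the 0-based cell index, and the assumption that coordinates lie in [-box/2, box/2) are all choices made for the example.

```python
def build_cell_lists(positions, box, m):
    """Build linked lists of particles for a square 2D box split into m x m cells.

    head[c] holds the most recently inserted particle of cell c (-1 if empty);
    nxt[i] holds the next particle in the same cell as i (-1 at the end).
    Positions are assumed to lie in [-box/2, box/2) in both coordinates.
    """
    head = [-1] * (m * m)
    nxt = [-1] * len(positions)
    for i, (x, y) in enumerate(positions):
        ix = min(int((x / box + 0.5) * m), m - 1)
        iy = min(int((y / box + 0.5) * m), m - 1)
        c = ix + iy * m              # 0-based cell index (illustrative choice)
        nxt[i] = head[c]             # push particle i onto the front of the list
        head[c] = i
    return head, nxt


def particles_in_cell(head, nxt, c):
    """Walk the linked list of cell c and return its particle indices."""
    out = []
    i = head[c]
    while i != -1:
        out.append(i)
        i = nxt[i]
    return out
```

Rebuilding the two arrays costs one pass over the particles, which is why it is affordable to do every timestep.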
Note that to account for the forces on particles in edge cells, periodic boundaries are used to have, for example, 25 interacting with the appropriate nearest neighbor periodic cells 4', 5', 1", and 21' (more on this later).

The program for figuring out nearest neighbor cells is easy to set up. Introducing cell indices $I_x$ and $I_y$ for the 2D grid in Fig. 2, each cell's index in the 2D grid can be computed from

$$\mathrm{ICELL}(I_x, I_y) = 1 + \mathrm{MOD}(I_x - 1 + M_x M_y,\ M_x) + \mathrm{MOD}(I_y - 1 + M_x M_y,\ M_y)\, M_x, \tag{1}$$

where MOD is the function which returns the remainder of its first argument on division by its second, and $M_x$ and $M_y$ are the number of cells in X and Y ($I_x \in \{1, \ldots, M_x\}$, $I_y \in \{1, \ldots, M_y\}$). For each cell, one-half of the nearest neighbor cells are given by

$$\mathrm{ICELL}(I_x+1, I_y),\quad \mathrm{ICELL}(I_x+1, I_y+1),\quad \mathrm{ICELL}(I_x, I_y+1),\quad \mathrm{ICELL}(I_x-1, I_y+1), \tag{2}$$

which correctly gives the cell neighbors of 13 to be 17, 18, 19, and 14.

Now we can explain the treatment of Newton's third law. A particle in cell 13 interacts with the particles in 8 neighboring cells, but the algorithm only checks particles in 17, 18, 19, and 14.
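Equations (1) and (2) can be checked directly. The short Python sketch below transcribes them with 1-based cell indices; the function names `icell`, `half_neighbors`, and `build_map` are illustrative and are not the names used in the program.

```python
def icell(ix, iy, mx, my):
    """Eq. (1): 1-based cell index with periodic wrap-around (ix, iy are 1-based)."""
    return 1 + (ix - 1 + mx * my) % mx + ((iy - 1 + mx * my) % my) * mx


def half_neighbors(ix, iy, mx, my):
    """Eq. (2): the four neighbor cells searched from cell (ix, iy)."""
    return [icell(ix + 1, iy, mx, my),
            icell(ix + 1, iy + 1, mx, my),
            icell(ix, iy + 1, mx, my),
            icell(ix - 1, iy + 1, mx, my)]


def build_map(mx, my):
    """Precompute the half-neighbor list of every cell, keyed by cell index."""
    return {icell(ix, iy, mx, my): half_neighbors(ix, iy, mx, my)
            for iy in range(1, my + 1)
            for ix in range(1, mx + 1)}
```

For the 5 x 5 grid of Fig. 2, `build_map(5, 5)[13]` lists cells 14, 19, 18, and 17, and `build_map(5, 5)[25]` lists 21, 1, 5, and 4, in agreement with the text. Each adjacent pair of cells appears in exactly one of the two lists, which is what allows every pair interaction to be computed once.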
Interactions of particles in 13 with particles in 12, for example, are treated when particles in cell 12 are the focus of attention, and similarly for 7, 8, and 9. Note that this formula also gives the nearest neighbors of particles in cell 25 to be those in 4, 5, 1, and 21 (not the periodic cells 4', 5', 1", and 21'). This will be explained later. Because of their regular arrangement, the list of neighboring cells is fixed and may be precomputed once and for all at the beginning of the program (in subroutine MAPS).

2. Sequential Link-Cell Algorithm

Incorporating a link-cell search into the velocity-Verlet algorithm gives, in outline,

(a) Read in initial data.
(b) Read in configurational data (solid inclusions).
(c) Set up the map to find neighboring cells (subroutine MAPS).
(d) Perform the QDPD cycle for each time step.

Given the forces $f_{1i}$ acting on particles at time t, the fundamental QDPD cycle, repeated for as many timesteps as are in a simulation, is

(a) Compute the new particle positions from

$$r_{2i} = r_{1i} + v_{1i}\,\Delta t + f_{1i}\,\Delta t^2/2. \tag{3}$$

(b) Compute the midpoint velocity (velocity at the midpoint of the time step) from

$$\tilde{v}_i = v_{1i} + f_{1i}\,\Delta t/2. \tag{4}$$

(c) Create the linked list (subroutines TOPMAP and LINKS).
(d) Calculate new forces $f_{2i}$.
(e) Compute new velocities from the new forces

$$v_{2i} = \tilde{v}_i + f_{2i}\,\Delta t/2. \tag{5}$$

And then the cycle begins again.

In the basic QDPD (MD) cycle above, the $r_2$s may belong to particles which have moved out of the simulation box, such as particles in the periodic image of cell 21 represented by the dashed box 21' in Fig.
2. This can be handled by introducing another set of coordinates, $r_3$, given by

$$\begin{aligned}
r_{3x}(i) &= r_{2x}(i) - \mathrm{ANINT}\big(r_{2x}(i)/L_x\big)\,L_x \\
r_{3y}(i) &= r_{2y}(i) - \mathrm{ANINT}\big(r_{2y}(i)/L_y\big)\,L_y \\
r_{3z}(i) &= r_{2z}(i) - \mathrm{ANINT}\big(r_{2z}(i)/L_z\big)\,L_z,
\end{aligned} \tag{6}$$

where $L_x$, $L_y$, and $L_z$ are the simulation dimensions and ANINT is the function which returns the nearest whole number to its argument. The $r_3$ coordinates are used in creating linked lists of particles in cells prior to the force calculations. Consequently, particles which have moved into 21' will end up assigned to 21, which is where the formula for nearest neighbors expects to find them. Hence the $r_3$ coordinates make sure that particles are assigned to one of the cells in the QDPD simulation box (i.e., particles are kept within the QDPD simulation box running from $(-L_x/2, -L_y/2, -L_z/2)$ to $(L_x/2, L_y/2, L_z/2)$ in $r_3$ space). They also are consistent with the proper mapping of nearest neighbor cells given by the $\mathrm{ICELL}(I_x, I_y)$ formula.

A related point concerns pairs of particles whose $r_2$ coordinates differ by more than $r_c$ because one of them has moved out of the simulation box. The forces are calculated from $r_2$ coordinates, so two particles that are neighbors across the periodic boundary can appear to be more than $r_c$ apart, a violation of the minimum image condition. To correct for this, the difference between particles in the forces calculation is

$$\begin{aligned}
\Delta r_{2x}(ij) &= r_{2x}(i) - r_{2x}(j) - \mathrm{ANINT}\big([r_{2x}(i) - r_{2x}(j)]/L_x\big)\,L_x \\
\Delta r_{2y}(ij) &= r_{2y}(i) - r_{2y}(j) - \mathrm{ANINT}\big([r_{2y}(i) - r_{2y}(j)]/L_y\big)\,L_y \\
\Delta r_{2z}(ij) &= r_{2z}(i) - r_{2z}(j) - \mathrm{ANINT}\big([r_{2z}(i) - r_{2z}(j)]/L_z\big)\,L_z.
\end{aligned} \tag{7}$$

Consider Fig. 2 again. With the ANINT corrections above, particles in the cells 4, 5, 1, and 21 are within $r_c$ of particles in cell 25. This is the way our sequential version of the program was written.
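The cycle of Eqs. (3)-(5), together with the periodic corrections of Eqs. (6) and (7), can be sketched in a few lines of Python. This is a hedged illustration rather than the sequential program: particles have unit mass, the force is a simple soft repulsion standing in for the full DPD force, neighbors are found by a brute-force double loop instead of link-cells, and every function name and default parameter is an assumption made for the example. Note that Fortran's ANINT rounds halves away from zero, which Python's built-in `round` does not, so a small helper is used.

```python
import math


def anint(x):
    """Fortran-style ANINT: nearest whole number, halves rounded away from zero."""
    return math.floor(x + 0.5) if x >= 0.0 else -math.floor(-x + 0.5)


def wrap(r2, box):
    """Eq. (6): fold an unwrapped position r2 back into the simulation box."""
    return [c - anint(c / L) * L for c, L in zip(r2, box)]


def min_image(ri, rj, box):
    """Eq. (7): separation between particle i and the nearest image of j."""
    return [(a - b) - anint((a - b) / L) * L for a, b, L in zip(ri, rj, box)]


def forces(r2, box, rc=1.0, a=25.0):
    """Pairwise soft repulsion (an assumed stand-in for the DPD forces)."""
    f = [[0.0, 0.0, 0.0] for _ in r2]
    for i in range(len(r2)):
        for j in range(i + 1, len(r2)):
            d = min_image(r2[i], r2[j], box)
            r = math.sqrt(sum(c * c for c in d))
            if 0.0 < r < rc:
                s = a * (1.0 - r / rc) / r
                for k in range(3):
                    f[i][k] += s * d[k]   # force on i
                    f[j][k] -= s * d[k]   # equal and opposite force on j
    return f


def qdpd_step(r1, v1, f1, dt, box):
    """One cycle of Eqs. (3)-(5) for unit-mass particles."""
    r2 = [[x + v * dt + 0.5 * f * dt * dt for x, v, f in zip(ri, vi, fi)]
          for ri, vi, fi in zip(r1, v1, f1)]                  # Eq. (3)
    vmid = [[v + 0.5 * f * dt for v, f in zip(vi, fi)]
            for vi, fi in zip(v1, f1)]                        # Eq. (4)
    f2 = forces(r2, box)                                      # new forces
    v2 = [[vm + 0.5 * f * dt for vm, f in zip(vmi, fi)]
          for vmi, fi in zip(vmid, f2)]                       # Eq. (5)
    return r2, v2, f2
```

Two particles at x = 1.9 and x = -1.9 in a box of side 4 are 3.8 apart in $r_2$ coordinates but only 0.2 apart through the periodic boundary, so `min_image` is what makes them interact.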
Our parallel version of the program does this differently. In treating edge cells, a "ghost" layer of cells is added to the QDPD simulation box. The dashed cells in Fig. 2, the nearest neighbor periodic cells, are part of the "ghost" layer of cells. The formation of these "ghost" cells will be discussed in the next section.

In the sequential version of the program, MAPS considers all cells in all layers except for the top layer (the topmost layer in Y in a 3D simulation), and computes only half of the neighbors of each of these cells, taking into account Newton's third law. Because QDPD in its present form is being used to study the steady-shear viscosity of a suspension of solid inclusions in a Newtonian fluid, there is a shear boundary condition at the topmost layer of the QDPD simulation box, implemented with the Lees-Edwards boundary conditions [9] (pp. 246-247). These boundary conditions simulate a uniform shear in the XY plane (i.e., a constant velocity gradient is set up in the Y direction and the actual shear occurs in the X direction).

Figure 3 shows a time series of the motion of a single ellipsoidal inclusion subject to shear. Proceeding from left to right, the different colors (or greyscale levels) [12] correspond to the time sequence. The rotation of a single ellipsoid is a well-known phenomenon seen in experiments, called Jeffery's orbits. The shearing boundary conditions were obtained by applying a constant strain rate to the right at the top of the figure and to the left at the bottom of the figure. Figure 4 [9] (p. 246) demonstrates this situation in the context of the computer simulation. The central box in the figure is the QDPD simulation box (the entire box in Fig.
5, not just the one on the central processor 4). Boxes in the layer above are moving at a certain speed in the positive direction, and boxes in the layer below are moving at the same speed in the negative direction.

To implement this shear boundary condition at the topmost layer (because of Newton's third law, we only have to treat the top), the top layer is tackled separately in subroutine TOPMAP in our sequential version of the program. Its purpose is to create the list of neighboring cells for the topmost layer of cells, taking into account the movement of the cells with respect to each other due to shear. TOPMAP is called every timestep in the simulation just before the force calculation, but only on the topmost processors.

[FIGURE 3 OMITTED] [FIGURE 4 OMITTED] [FIGURE 5 OMITTED]

The periodic minimum image convention must also be modified to account for this shear. The $r_3$s are modified to be

$$\begin{aligned}
\mathrm{cory} &= \mathrm{ANINT}\big(r_{2y}(i)/L_y\big) \\
r_{3x}(i) &= r_{2x}(i) - \mathrm{cory}\cdot\mathrm{strain} \\
r_{3x}(i) &= r_{3x}(i) - \mathrm{ANINT}\big(r_{2x}(i)/L_x\big)\,L_x \\
r_{3y}(i) &= r_{2y}(i) - \mathrm{ANINT}\big(r_{2y}(i)/L_y\big)\,L_y \\
r_{3z}(i) &= r_{2z}(i) - \mathrm{ANINT}\big(r_{2z}(i)/L_z\big)\,L_z,
\end{aligned} \tag{8}$$

where the upper layer (BCD in Fig. 4) is displaced relative to the central box by an amount strain. Similar corrections are made in forces.

3. Spatial Decomposition Theory

QDPD was originally written in Fortran 77 as a serial program. Much of the formalism of the sequential link-cell algorithm relies on the Allen and Tildesley book [9] and on computer routines (such as MAPS and TOPMAP) discussed in the book and available on the Web [13]. We have retained the names of the Allen and Tildesley routines in our program and in the discussion.
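The sheared wrap of Eq. (8) follows the same pattern as those Allen and Tildesley routines, and a minimal Python sketch of it is given here for a single particle. One assumption should be flagged loudly: the sketch applies the X wrap to the already shifted coordinate, as an in-place update of the coordinate would do, whereas the third line of Eq. (8) as printed takes ANINT of $r_{2x}$. The function names are illustrative.

```python
import math


def anint(x):
    """Fortran-style ANINT: nearest whole number, halves rounded away from zero."""
    return math.floor(x + 0.5) if x >= 0.0 else -math.floor(-x + 0.5)


def wrap_sheared(r2, box, strain):
    """Fold a position into the box when the image boxes above and below
    are displaced in X by +strain and -strain (Lees-Edwards boundaries)."""
    x, y, z = r2
    lx, ly, lz = box
    cory = anint(y / ly)                 # number of box heights crossed in Y
    x3 = x - cory * strain               # undo the displacement of that layer
    x3 = x3 - anint(x3 / lx) * lx        # assumption: wrap the SHIFTED value
    y3 = y - cory * ly
    z3 = z - anint(z / lz) * lz
    return [x3, y3, z3]
```

With a box of side 4 and strain 0.5, a particle at (-1.9, 2.2, 0) has left through the top face, so it re-enters at the bottom, shifted to x = -2.4 and then wrapped to x = 1.6. A particle that has not crossed in Y is returned unchanged by the shear correction.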
Routines discussed later in the text, such as EXTVOL, LEBC, and MOVPAR, are parallel routines and have no counterpart in Allen and Tildesley.

To improve computational performance, a parallelization was done relatively quickly using a simplified version of the replicated data approach and the standard message passing interface library (MPI [1]), as described in Sims et al. [14]. We reported speedups of as much as 17.5 times on 24 processors of a 32 processor shared memory SGI Origin 2000 (1). When doing a calculation on multiple processors, the total run time can be represented as the sum of computation (cpu) time and communication time, viz.,

$$t = t_{cpu} + t_{comm}. \tag{9}$$

In the replicated data approach, as P (the number of processors) increases, $t_{cpu}$ goes down, but the same amount of information still has to be communicated (proportional to N, the number of particles), so $t_{comm}$ does not scale. Also, distributed memory machines often do not possess enough memory on a processing node to hold all of the data for a large job.
When the goal is to simulate an extremely large system on a distributed-memory computer, both to exploit its larger total memory and to take advantage of a larger number of processors, a different approach is needed. Since the link-cell algorithm we used in the sequential and replicated data approaches breaks the simulation space into domains, it seems natural to map this geometrical, or domain, decomposition onto separate processors. Doing so is the essence of the parallel link-cell technique [11, 15] (2). By subdividing the physical volume among processors, most of the computation becomes local and the communication is minimized so there is, in principle, an N/P scaling (N = number of particles, P = number of processors), an efficient approach for distributed-memory computers and networks of workstations.

The basic idea is this: Split the total volume into P domains, where P is the number of processors. If we choose a 1D decomposition ("slices of bread"), then the pth processor is responsible for particles whose x-coordinates lie in the range

$$(p-1)\,L_x/P \le x < p\,L_x/P. \tag{10}$$

Similar equations apply for 2D and 3D decompositions for simulation dimensions $L_y$ and $L_z$. Whether the decomposition is 1D, 2D, or 3D depends on the number of processors. An algorithm due to Plimpton [17] is used to assign P processors to a 3D box so as to minimize the surface area (and hence, yield a good load balancing). For P processors and a given simulation box of dimensions $L_x$, $L_y$, and $L_z$, the algorithm is the following. Loop through all factorizations of P into $P_x$, $P_y$, and $P_z$ processors, computing the surface area of the resulting box, and pick the one with the minimum surface area. Of multiple equal surface areas (for example, $(P_x, P_y, P_z)$ = (4,2,2), (2,4,2), (2,2,4)), pick the one with $P_x \le P_y \le P_z$.

Each processor runs a link-cell program corresponding to a particular domain of the simulation box. For example, in Fig. 5 nine processors were used to divide the 2D simulation space into domains, each processor being assigned to one of the nine domains (here we assign each processor an index, where the indices start at 0). We also show the central processor's domain being subdivided into cells.

To complete the force calculation on particles in cells at the interface between processors, each processor needs to know information about the particles in the adjacent cells, which now will be found on a neighboring processor. To handle this problem we construct an extra layer of cells on each processor at the interface between processors. At each timestep we communicate information across the interface between adjacent processors describing the particles in these edge cells (subroutine EXTVOL). The information that has to be passed by EXTVOL is the information needed for the forces calculations, which is $r_{3i}$, $r_{2i}$, $\tilde{v}_i$, and the unique particle number
discussed below. For example, in Fig. 5 we show processors 3, 4, and 5 and we also show, with dashed lines, the cells in processors 3 and 5 which are adjacent to 4. Information about the particles in these dashed line cells is communicated to 4, making up "ghost" cells on 4. To complete the "extended volume" needed on processor 4 to compute the forces on all the particles it "owns", information is communicated (swapped) across the interface between adjacent processors in the Y direction as well. To account for the cross-hatched corner cells, the swap in Y includes information about not only particles that the processor owns but also information about "other" particles in "ghost" cells. So processor 7 sends information about particles in the dashed line cells as well as the cross-hatched cells (obtained from processors 6 and 8) to processor 4. At this point processor 4 has all the information it needs to calculate forces on all the particles it owns (processor 4 now has information about all the particles shown in the extended volume comprised of the domain of processor 4 plus the surrounding dashed line ghost cells), and similarly all of the other processors have all the information they need. These exchanges of data can be achieved by one set of communications between the processors. A processor only has to communicate once with all of its neighbors, so each processor communicates with at most four other processors (six in 3D), rather than, say, 64 in a 64 processor replicated data calculation.
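The reason the corner cells arrive without any diagonal communication can be seen in a toy model of the two swaps. The Python sketch below runs on a single process and does not use MPI: each "processor" simply holds the set of domain labels it has data from, and whole domains are exchanged rather than only the particles in edge cells. The processor numbering (index = column + row x number of columns) matches Fig. 5; the function names are illustrative.

```python
def neighbors(p, px, py):
    """West, east, south and north neighbors of processor p on a periodic grid."""
    ix, iy = p % px, p // px
    west = (ix - 1) % px + iy * px
    east = (ix + 1) % px + iy * px
    south = ix + ((iy - 1) % py) * px
    north = ix + ((iy + 1) % py) * px
    return west, east, south, north


def exchange_ghosts(px, py):
    """Return what every processor knows after the X swap and after the Y swap."""
    nproc = px * py
    owned = [{p} for p in range(nproc)]

    # Swap in X: each processor sends only the data it owns.
    after_x = [set(s) for s in owned]
    for p in range(nproc):
        west, east, _, _ = neighbors(p, px, py)
        after_x[west] |= owned[p]
        after_x[east] |= owned[p]

    # Swap in Y: each processor sends owned data AND the ghosts received in X,
    # which is how corner information reaches the diagonal neighbors.
    after_y = [set(s) for s in after_x]
    for p in range(nproc):
        _, _, south, north = neighbors(p, px, py)
        after_y[south] |= after_x[p]
        after_y[north] |= after_x[p]
    return after_x, after_y
```

For the 3 x 3 layout of Fig. 5, processor 4 holds data from {3, 4, 5} after the X swap and processor 7 holds {6, 7, 8}, so the Y swap from 7 delivers the corner data of 6 and 8 to 4 with only four partners per processor.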
Now on each processor, form a link-cell list of all particles in the original volume plus the extended volume. Loop over the particles in the original volume, calculating the forces on them and on their pair particles (for conservation of momentum). Care must be taken to add these pair particle forces on particles in the extended volume to the forces on the pair particles in the processor "owning" them, which necessitates an extra set of communications between processors (the reverse of the communication swaps setting up the "ghost" cells). This extra communication step is necessary in the QDPD method since the interparticle force calculation involves the use of a random number for thermal effects, and momentum conservation requires that the same random number be used in the equal and opposite force calculation.

Finally, calculate the new positions of all particles and move the particles which have left a processor to their new home processor. If particles move into domains controlled by other processors, information about the particle (the particle's properties) must be moved to its new "home" processor. Again these exchanges of data can be achieved by one set of communications between the processors, and are implemented in subroutine MOVPAR. In this set of communications, all information about a particle needed for one time step must be communicated, not just the information needed for the forces calculation (the information communicated in EXTVOL as explained above).

Our spatial decomposition program has the following added features. First, following Plimpton [17], we distinguish between "owned" particles and "other" particles, those particles that are on neighboring processors and are part of the extended volume on any given processor. For "other" particles, only the information needed to calculate forces is communicated to neighboring processors.
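The requirement that one random number serve both members of a pair can be illustrated with a small Python sketch of a DPD pair force. The functional forms are assumptions taken from the commonly used Groot and Warren formulation (linear weight function, dissipative weight equal to its square, and noise amplitude tied to the friction by $\sigma^2 = 2\gamma k_B T$); the parameter values and function names are illustrative and are not taken from the QDPD program.

```python
import math
import random


def dpd_pair_force(ri, rj, vi, vj, rng,
                   rc=1.0, a=25.0, gamma=4.5, kT=1.0, dt=0.01):
    """Force on particle i due to j; the force on j is its exact negative."""
    d = [x - y for x, y in zip(ri, rj)]
    r = math.sqrt(sum(c * c for c in d))
    if r >= rc or r == 0.0:
        return [0.0, 0.0, 0.0]
    e = [c / r for c in d]                      # unit vector from j to i
    w = 1.0 - r / rc                            # assumed weight function
    sigma = math.sqrt(2.0 * gamma * kT)         # fluctuation-dissipation coupling
    vdot = sum(ec * (p - q) for ec, p, q in zip(e, vi, vj))
    theta = rng.gauss(0.0, 1.0)                 # ONE random number per pair
    mag = a * w - gamma * w * w * vdot + sigma * w * theta / math.sqrt(dt)
    return [mag * ec for ec in e]


def accumulate(positions, velocities, seed=0):
    """Sum pair forces, applying each one to i and its negative to j."""
    rng = random.Random(seed)
    f = [[0.0, 0.0, 0.0] for _ in positions]
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            fij = dpd_pair_force(positions[i], positions[j],
                                 velocities[i], velocities[j], rng)
            for k in range(3):
                f[i][k] += fij[k]
                f[j][k] -= fij[k]               # same theta, opposite sign
    return f
```

Because the force on j is never recomputed with a fresh random number, the total force sums to zero to rounding error. In the parallel setting this is the contribution that has to be sent back to the processor owning j, since a second evaluation there would draw a different random number.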
Second, the QDPD technique is being applied to suspensions, so there are two types of particles, "free" particles and particles belonging to ellipsoids (the solid inclusions). A novel feature of this work is that we explicitly do not keep all particles belonging to the same ellipsoid on the same processor. Since the largest ellipsoid that might be built can consist of as many as 50% of all particles, that would be difficult if not impossible to handle without serious load-balancing implications. What we do is assign each particle a unique particle number when it is read in. Each processor has the list of ellipsoid definitions consisting of lists of particles defined by these unique particle numbers. Each processor computes solid inclusion properties for each particle it "owns," and these properties are globally summed (using MPI_REDUCE [1, 18]) over all processors so that all processors have the same solid inclusion properties. Since there are only a small number of ellipsoids (relative to the number of particles), the amount of communication necessary for the global sums is small and the amount of extra memory is also relatively small. Hence it is an efficient technique.

4. Spatial Decomposition Program Details

After various preliminaries, the program reads information about the simulation space and then calls DOMAIN to figure out the spatial (domain) decomposition. To determine which processors control adjacent domains we identify each processor uniquely by considering each processor in the network as a cell in a link-cell structure. We then use the link-cell algorithm to determine the addresses of a processor's neighbors. Particles are then mapped onto processors on the basis of their x, y, and z coordinates. For 3D, we denote
the number of processors allocated in the X, Y, and Z dimensions by $P_x$, $P_y$, and $P_z$, respectively, so

$$P = P_x P_y P_z. \tag{11}$$

For a particle at position $r_i = (x_i, y_i, z_i)$ in a simulation box with sides of length $L_x$, $L_y$, and $L_z$, with $0 \le x_i < L_x$, $0 \le y_i < L_y$, $0 \le z_i < L_z$, the processor coordinates are given by

$$\begin{aligned} I_{x_i} &= \mathrm{INT}(x_i P_x / L_x) \\ I_{y_i} &= \mathrm{INT}(y_i P_y / L_y) \\ I_{z_i} &= \mathrm{INT}(z_i P_z / L_z), \end{aligned} \tag{12}$$

where INT is the function returning the integer part of the argument in brackets. The mapping from processor coordinates $(I_{x_i}, I_{y_i}, I_{z_i})$ to processor index is given by

$$I_i = I_{x_i} + I_{z_i} P_x + I_{y_i} P_x P_z. \tag{13}$$

Coordinates of the center of each processor's simulation box can be calculated from

$$\begin{aligned} \mathrm{rorigin}(1) &= L_x \left( (I_{x_i} + 0.5) / P_x - 0.5 \right) \\ \mathrm{rorigin}(2) &= L_y \left( (I_{y_i} + 0.5) / P_y - 0.5 \right) \\ \mathrm{rorigin}(3) &= L_z \left( (I_{z_i} + 0.5) / P_z - 0.5 \right). \end{aligned} \tag{14}$$

Particles are allocated to processors on the basis of $I_i$ at the start (subroutines INITPR (for particles) and INITSNEW (for ellipsoids)) and whenever particles are moved. While figuring out the domain decomposition, a processor's north (+y direction), south (-y direction), east (+x direction), west (-x direction), up (+z direction), and down (-z direction) neighboring processors are tabulated.
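As an illustration, Eqs. (12) to (14) can be written out directly. This is a minimal Python sketch rather than the Fortran of the QDPD program; the function names and the tuple-based arguments are hypothetical. Python's `int()` truncates toward zero, which matches INT for the non-negative arguments used here.

```python
def processor_coords(r, box, procs):
    """Eq. (12): processor coordinates of a particle at r = (x, y, z).

    box = (Lx, Ly, Lz) and procs = (Px, Py, Pz); coordinates satisfy 0 <= r[k] < box[k].
    """
    return tuple(int(r[k] * procs[k] / box[k]) for k in range(3))

def processor_index(coords, procs):
    """Eq. (13) as printed: I = Ix + Iz*Px + Iy*Px*Pz."""
    ix, iy, iz = coords
    px, _py, pz = procs
    return ix + iz * px + iy * px * pz

def domain_center(coords, box, procs):
    """Eq. (14): centre of a processor's sub-box.

    As printed, Eq. (14) measures the centre from the middle of the whole box.
    """
    return tuple(box[k] * ((coords[k] + 0.5) / procs[k] - 0.5) for k in range(3))

# Example: a 3 x 3 x 1 processor grid on a 30 x 30 x 30 box.
box = (30.0, 30.0, 30.0)
procs = (3, 3, 1)
coords = processor_coords((12.0, 25.0, 4.0), box, procs)
index = processor_index(coords, procs)
center = domain_center(coords, box, procs)
```

For a 2D decomposition ($P_z = 1$), Eq. (13) reduces to the row-by-row numbering $I_x + I_y P_x$.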
The simulation box size for each processor is given by

$$\mathrm{rprosl}(1) = L_x / P_x, \quad \mathrm{rprosl}(2) = L_y / P_y, \quad \mathrm{rprosl}(3) = L_z / P_z. \tag{15}$$

In the domain decomposition molecular dynamics cycle (subroutine CYCLE), we now have, on each processor:

Compute the new particle positions from
$$r_{2i} = r_{1i} + v_{1i} \Delta t + f_{1i} \Delta t^2 / 2. \tag{16}$$
Compute the midpoint velocity (velocity at the midpoint of the time step) from
$$\tilde{v}_i = v_{1i} + f_{1i} \Delta t / 2. \tag{17}$$
Calculate r3s to make sure particles remain in the QDPD box.
Move particles (MOVPAR) to their new home processor based on r3s.
Construct an extended volume consisting of owned cells plus ghost cells (EXTVOL) based on r3s. EXTVOL calls a subroutine (LEBC) to apply Lees-Edwards shear boundary conditions.
Construct the link-cell list (LINKS) based on r3 coordinates.
Calculate new forces (FORCES), including a call to THIRDLAW, which transfers pair forces back to their home processor and adds them to forces there.
Compute new velocities from the new forces
$$v_{2i} = \tilde{v}_i + f_{2i} \Delta t / 2. \tag{18}$$

The way the Newton's third law forces are handled in spatial (domain) decomposition is the following. A table is kept of edge particles that are sent in all directions. Then after forces are calculated, THIRDLAW loops over just the "other" particles looking for force contributions that have to be sent back to the processor that "owns" the particle and added to the forces there. THIRDLAW then communicates these Newton's third law force additions back to the "home" processors of the "other" particles and adds them to the forces there. Some of these steps require additional explanation.
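Before turning to those details, the three update formulas of the cycle, Eqs. (16) to (18), can be sketched for a single coordinate on one processor. This is only an illustration: the function names are hypothetical, and the harmonic force in the example is a stand-in for the QDPD force routine, which is not reproduced here. The formulas are taken exactly as printed, with no explicit mass factor.

```python
def new_positions(r1, v1, f1, dt):
    """Eq. (16): r2 = r1 + v1*dt + f1*dt**2/2."""
    return [r + v * dt + f * dt * dt / 2 for r, v, f in zip(r1, v1, f1)]

def midpoint_velocities(v1, f1, dt):
    """Eq. (17): velocity at the midpoint of the time step."""
    return [v + f * dt / 2 for v, f in zip(v1, f1)]

def new_velocities(v_mid, f2, dt):
    """Eq. (18): complete the velocity update with the new forces."""
    return [v + f * dt / 2 for v, f in zip(v_mid, f2)]

def example_force(r):
    """Placeholder force (harmonic spring); NOT the QDPD interparticle force."""
    return [-x for x in r]

def step(r1, v1, dt):
    """One cycle for non-interacting test particles using the placeholder force."""
    f1 = example_force(r1)
    r2 = new_positions(r1, v1, f1, dt)
    v_mid = midpoint_velocities(v1, f1, dt)
    f2 = example_force(r2)  # in the parallel code this is where FORCES/THIRDLAW run
    v2 = new_velocities(v_mid, f2, dt)
    return r2, v2
```

In the parallel cycle, the particle moves and the extended-volume construction sit between the midpoint-velocity step and the new force evaluation.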
In MOVPAR the $r_{3i}$ coordinates are transformed, by subtracting the coordinates of the center of each processor's simulation box, so that they are in the range

$$-\mathrm{rprosl}(k)/2 \le \mathrm{r3}(k,i) - \mathrm{rorigin}(k) < \mathrm{rprosl}(k)/2, \tag{19}$$

where rprosl(k) is the size of that processor's domain in the k direction. Particles which don't meet this criterion have moved out of the processor and are sent to their new home processor. A subtle point is that this is relatively slow motion, so we know that the move is to the nearest neighbor in the k dimension, the one in the negative or positive direction, depending on whether r3(k,i) - rorigin(k) < -rprosl(k)/2 or r3(k,i) - rorigin(k) $\ge$ rprosl(k)/2. This is true for k = 2 or 3, but because of the shear boundary condition at the topmost layer, particles may have moved more than one processor away in X in a single time step. We handle this by finding the maximum number of swaps in X on each processor, then doing a global MAX over the values from all processors to determine how many swaps to do.

Next comes the formation of extended volumes using "ghost" cells in EXTVOL. To accommodate the "ghost" cells, the number of cells in each direction is increased by 2. So, for example, for a division of the central processor into 100 cells as in Fig. 5, the X and Y cell dimensions are 10. $M_x$ and $M_y$, the X and Y cell dimensions for this processor including ghost cells, are 12 (10 + 2) to accommodate the ghost cells on each side.
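The ownership test of Eq. (19) and the X swap count described for MOVPAR can be sketched as follows. The function names are hypothetical, and the hop formula and the plain `max` over per-processor values are assumptions standing in for the program's bookkeeping and its global MAX reduction.

```python
import math

def move_direction(r3_k, rorigin_k, rprosl_k):
    """Return -1, 0, or +1: which neighbour in dimension k should own the particle.

    0 means the particle satisfies Eq. (19) and stays on this processor.
    """
    d = r3_k - rorigin_k
    if d < -rprosl_k / 2:
        return -1
    if d >= rprosl_k / 2:
        return 1
    return 0

def hops_in_x(r3_x, rorigin_x, rprosl_x):
    """Number of processor domains a particle has crossed in X (may exceed 1 under shear)."""
    return abs(math.floor((r3_x - rorigin_x) / rprosl_x + 0.5))

def swaps_needed(per_processor_x):
    """Emulate the global MAX: per_processor_x maps processor -> list of X offsets.

    Each entry is (r3_x, rorigin_x, rprosl_x) for one particle on that processor.
    """
    local_max = [
        max((hops_in_x(*p) for p in particles), default=0)
        for particles in per_processor_x.values()
    ]
    return max(local_max, default=0)
```

Every processor then performs the same number of X swaps, so the exchange stays collective even if only one processor holds a particle that has moved several domains.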
We use the following to define a cell index for particle i ($\mathrm{ICELL}_i$):

$$\mathrm{ICELL}_i = I_{x_i} + (I_{y_i} - 1) M_x + (I_{z_i} - 1) M_x M_y, \tag{20}$$

where $I_{x_i}$, $I_{y_i}$, and $I_{z_i}$ are now given by

$$\begin{aligned} I_{x_i} &= 1 + \mathrm{INT}((r(1,i) S_x + 0.5) M_x) \\ I_{y_i} &= 1 + \mathrm{INT}((r(2,i) S_y + 0.5) M_y) \\ I_{z_i} &= 1 + \mathrm{INT}((r(3,i) S_z + 0.5) M_z). \end{aligned} \tag{21}$$

$S_x$, $S_y$, and $S_z$ are scale factors whose purpose is to transform coordinates so that a processor's "own" particles in a domain will have values in the range

$$2 \le I_{x_i} \le M_x - 1, \quad 2 \le I_{y_i} \le M_y - 1, \quad 2 \le I_{z_i} \le M_z - 1. \tag{22}$$

In Fig. 6 we show the central processor from Fig. 5 again, with its "own" and "ghost" cells renumbered according to the above. Using these scale factors, it is straightforward to identify which particles need to be passed in all 4 (or 6) directions. For example, particles whose $I_{x_i}$ value is 2 are left edge particles and need to be passed to the processor to the left; particles whose $I_{x_i}$ value is 11 ($M_x - 1$) are right edge particles and need to be passed to the processor on the right. It is important to note that particles in "ghost" cells are included in subsequent swaps, so for example particles whose $I_{y_i}$ value is 2 are passed down, and that includes particles in the "ghost" cells with $I_{x_i}$ = 1 and 12, and particles whose $I_{y_i}$ value is 11 ($M_y - 1$) are passed up. This is the way the particles in corner cells are made available to adjacent processors.
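A short sketch of Eqs. (20) to (22) may help. The text does not give the scale factors explicitly, so the form used in `scale_factor` is an assumption: it is chosen so that local coordinates in the range of Eq. (19) land in cells 2 to M - 1, with one ghost cell of the same width on either side. All function names are hypothetical.

```python
def scale_factor(m, rprosl):
    """ASSUMED scale factor: maps owned local coordinates onto cells 2..m-1.

    m is the cell count including the two ghost cells; rprosl is the domain size.
    """
    return (m - 2) / (m * rprosl)

def cell_coord(r, s, m):
    """Eq. (21) for one axis: 1-based cell coordinate of local coordinate r."""
    return 1 + int((r * s + 0.5) * m)

def cell_index(ix, iy, iz, mx, my):
    """Eq. (20): single cell index from the three cell coordinates."""
    return ix + (iy - 1) * mx + (iz - 1) * mx * my

def edge_flags(i, m):
    """Classify a cell coordinate: low ghost, low edge, interior, high edge, high ghost."""
    if i == 1:
        return "ghost-low"
    if i == 2:
        return "edge-low"       # passed to the neighbour in the negative direction
    if i == m - 1:
        return "edge-high"      # passed to the neighbour in the positive direction
    if i == m:
        return "ghost-high"
    return "interior"
```

For the example in the text (10 owned cells per side, so M = 12) and a domain of size 10, a local coordinate of 4.5 falls in cell 11 and is a right edge particle, while 5.5 falls in ghost cell 12.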
As processor 4 communicates information about particles in its edge cells with $I_x$ = 11 to processor 5, processor 5 in turn communicates information about particles in its left edge cells to processor 4, which become the right edge ghost cells on processor 4. So after swapping with processors to its left, right, north, and south, the complete "extended volume" exists on processor 4, and this can be followed by the link-cell list construction ($I_x$ = {1, 12}, $I_y$ = {1, 12}) and computation of forces (for particles owned by this processor, which are those in cells with $I_x$ = {2, 11}, $I_y$ = {2, 11}).

[FIGURE 6 OMITTED]

Now consider Fig. 5 again, and imagine calculating the forces using a single processor and the link-cell algorithm, and subdividing the simulation box into 30 cells in X and Y. The forces on particles in cells with $I_x$, $I_y$ = {11, 20} in Fig. 5 would be calculated exactly the same way as those on the particles owned by processor 4 in Fig. 6, for which $I_x$, $I_y$ = {2, 11}. This is the essence of the parallel link-cell method. Similar conditions apply for the other processors, except for processors containing cells on the edge of the simulation box, such as processor 8 in Fig. 7. Cells interior to the processor, for which $I_x$, $I_y$ are {2, 10}, are just like the cells on processor 4. At issue are the cells for which $I_x$ = 11 and those for which $I_y$ = 11, i.e., edge cells on the processor which are also edge cells for the whole simulation box (Fig. 5). But the right edge ghost cells ($I_x$ = 12) for processor 8 are $I_x$ = 1 cells for processor 6 and would be sent to processor 8 during the swap between these two processors (8 is the processor to the west (-x direction) of 6 and 6 is the processor to the east (+x direction) of 8). Similarly, processors 2 and 8 pair up to create the $I_y$ = 12 ghost cells on 8.
The net result of this is that the force calculation on particles in the domain of processor 8 will be calculated exactly the same way as the force on particles in the cells with $I_x$ = {21, 30}, $I_y$ = {21, 30} in a sequential simulation of the whole box with 30 cells in X and Y. Similar conditions pertain to other processors containing cells on the edge of the simulation box.

[FIGURE 7 OMITTED]

One point that was skipped in the above discussion is the treatment of the shear boundary conditions. In Fig. 8 we show the Fig. 5 simulation box again, and three boxes above the simulation box, moving to the right, as well as three boxes below the simulation box, moving to the left. 0', 1', and 2' are images of 0, 1, and 2 which have moved to the right because of the shear. 6', 7', and 8' have moved left. In Fig. 9 we redraw Fig. 8, showing the sheared upper boundary and the extended volume we have to build prior to computing forces. Cells that must be considered for edge cells (2,31) and (31,31) are shown with arrows. Note that because of Newton's third law, the extended volume we need includes left, right, and up layers, but not down ($I_y$ = 1). Also care must be taken to include the shear shown in the figure. Subroutine EXTVOL handles this by forming the Y "ghost" layer before X (for 3D, the order is Z, Y, X).
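The X shift applied to particles in the sheared ghost layer, given in Eqs. (23) and (24), can be sketched as follows. This is an illustrative Python version with hypothetical function names. It assumes X coordinates are measured from the center of the simulation box, and it spells out Fortran-style ANINT rounding (nearest whole number, halves away from zero) because Python's built-in `round` uses a different tie rule.

```python
import math

def anint(x):
    """Nearest whole number with halves rounded away from zero (Fortran ANINT behaviour)."""
    return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)

def sheared_x(x, strain, rmax_x):
    """Shift an image particle's X coordinate by the accumulated shear and wrap it.

    strain10 and tempx follow Eq. (24); the return value follows Eq. (23).
    """
    strain10 = rmax_x * strain
    tempx = x + strain10
    return tempx - anint(tempx / rmax_x) * rmax_x

# Example: box length 20 in X, accumulated strain 0.25, so images are shifted by 5.
shifted = [sheared_x(x, 0.25, 20.0) for x in (7.0, 2.0, -9.0)]
```

A particle at x = 7 is shifted to 12, which lies outside the box, and is wrapped back to -8; the other two stay inside the box after the shift.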
The $I_y$ = 32 layer is formed by processors 0, 1, and 2 sending their $I_y$ = 2 cells to processors 6, 7, and 8 respectively, and adding the simulation box distance in Y. In addition, movement to the right coming from the shear is computed from

$$\mathrm{r3}(1,k) = \mathrm{r3}(1,k) + \mathrm{strain10} - \mathrm{ANINT}(\mathrm{tempx} / \mathrm{rmax}(1)) \, \mathrm{rmax}(1), \tag{23}$$

where rmax(1) is the simulation box dimension in X and

$$\mathrm{tempx} = \mathrm{r3}(1,k) + \mathrm{strain10}, \quad \mathrm{strain10} = \mathrm{rmax}(1) \cdot \mathrm{strain}. \tag{24}$$

[FIGURE 8 OMITTED]

[FIGURE 9 OMITTED]

Now subroutine LEBC is called to relocate particle properties to the processor that needs the information. This is done using the same technique as in MOVPAR, but care must be taken to keep track of the relocations so they can be reversed in the THIRDLAW transfer of forces back to their home processor. With these maneuvers, the Lees-Edwards boundary condition is accomplished in our parallel program. Basically the program implements particles leaving at the bottom of the simulation box and entering at the top "ghost" layer (the mirror image) but with their X coordinates shifted to account for the strain.

5. Results and Discussion

Figure 10 shows the performance of our codes on two distributed memory architectures. In the figure we plot normalized processing time, which is the time to complete a benchmark run on multiple processors divided by the time to complete the same benchmark run on a single processor.
[FIGURE 10 OMITTED]

For the replicated data version of our code, the best we could do was a factor of 4.3 improvement on 16 processors on a Linux cluster with Myrinet. In comparison, the spatial decomposition version of the code, running on the same Linux cluster, showed greatly enhanced performance (a factor of 10.5 on 16 processors). The best results, for the spatial decomposition version, show a speedup of a factor of 24 on 27 200 MHz Power3 processors on an IBM SP2, a distributed memory cluster, but with a high-speed interconnect which allows it to approach the scalability of a shared memory machine in many cases. Our spatial decomposition code has proven effective in a shared memory environment [14] as well, where the speedups are a factor of 29 on 32 processors of an SGI Origin 3000 system and a factor of 50 on 64 processors of the same system. In contrast, for the replicated data parallelization, speedups are a factor of 17.5 on 24 processors of an SGI Origin 3000 [14]. Clearly, communication costs quickly become prohibitive for replicated data parallelizations on distributed memory architectures.
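To put the quoted speedups on a common footing, they can be converted to parallel efficiency (speedup divided by processor count) and to the normalized processing time plotted in Fig. 10 (the reciprocal of the speedup). The helper names below are hypothetical, and the only inputs are figures quoted in the text; the value 24.19 for the IBM SP2 run is the more precise figure given in the abstract.

```python
def efficiency(speedup, nprocs):
    """Parallel efficiency: fraction of ideal (linear) speedup achieved."""
    return speedup / nprocs

def normalized_time(speedup):
    """Normalized processing time: multi-processor time / single-processor time."""
    return 1.0 / speedup

# (speedup, processors) pairs as quoted in the text.
quoted = {
    "replicated data, Linux cluster": (4.3, 16),
    "spatial decomposition, Linux cluster": (10.5, 16),
    "spatial decomposition, IBM SP2": (24.19, 27),
    "spatial decomposition, SGI Origin 3000 (32)": (29.0, 32),
    "spatial decomposition, SGI Origin 3000 (64)": (50.0, 64),
    "replicated data, SGI Origin 3000": (17.5, 24),
}

for label, (s, n) in quoted.items():
    print(f"{label}: efficiency {efficiency(s, n):.2f}, "
          f"normalized time {normalized_time(s):.3f}")
```

On the same 16-processor Linux cluster, these figures correspond to roughly 27% efficiency for the replicated data code against roughly 66% for the spatial decomposition code.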
Scaling to a very large number of processors is poor even in the shared memory environment, and it makes the replicated data approach almost unusable on distributed memory machines including those with high-speed interconnects like the IBM SP2 cluster.

6. Summary

In adopting a spatial decomposition approach, we found a significant improvement in performance of our codes despite the additional complications of communicating the random forces (3), implementation of the Lees-Edwards boundary condition, and accounting for objects that can extend over many processor domains. Clearly, the main bottleneck of such an approach is the message passing between processors. As message-passing technologies improve, we expect corresponding improvements in the computational performance of our algorithms. Speedups like this on parallel architecture computers also allow us to systematically explore regions of parameter space (e.g., different solid fractions, broader particle size and shape distributions, and other boundary conditions) that would be prohibitive on single processor computers.
We also note for the record that this technique has proven effective in a shared memory environment [14] where the speedups were a factor of 29 on 32 processors of an SGI Origin 3000 system and a factor of 50 on 64 processors.

Acknowledgments

We would like to thank John G. Hagedorn for useful comments and programming support, Robert B. Bohn and N. Alan Heckert for graphics support, and Hsin Fang, Don Koss, Chris Schanzle, and Carl Spangler for systems support.

Accepted: March 19, 2004
Available online: http://www.nist.gov/jres

(1) Certain commercial equipment, instruments, or materials are identified in this paper to foster understanding. Such identification does not imply endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.
(2) See Plimpton [16] for excellent discussions of all fast parallel algorithms.
(3) The random force in the DPD formalism of particle i on particle j has to be equal and opposite of the force of particle j on particle i.

7. References

[1] P. J. Hoogerbrugge and J. M. V. A. Koelman, Simulating Microscopic Hydrodynamic Phenomena with Dissipative Particle Dynamics, Europhys. Lett. 19 (1), 155 (1992).
[2] P. Espanol and P. Warren, Statistical mechanics
of dissipative particle dynamics, Europhys. Lett. 30, 191 (1995).
[3] C. Marsh, G. Backx, and M. H. Ernst, The Fokker-Planck-Boltzmann equation for dissipative particle dynamics, Europhys. Lett. 38, 441 (1997).
[4] N. S. Martys and R. D. Mountain, Velocity Verlet algorithm for dissipative-particle-based models of suspensions, Phys. Rev. E 59 (3), 3733 (1999).
[5] L. Verlet, Computer "experiments" on classical fluids. I. Thermodynamical properties of Lennard-Jones molecules, Phys. Rev. 165, 201 (1967).
[6] I. Omelyan, On the numerical integration of motion for rigid polyatomics: the modified quaternion approach, Computer Phys. 12, 97 (1998).
[7] J. M. V. A. Koelman and P. J. Hoogerbrugge, Dynamic Simulation of hard sphere suspensions under steady shear, Europhys. Lett. 21 (1), 363 (1993).
[8] M. P. Allen and D. J. Tildesley, Computer simulation of liquids, Clarendon Press, Oxford (1987).
[9] B. Quentrec and C. Brot, New methods for searching for neighbours in molecular dynamics computations, J. Comput. Phys. 13, 430 (1973).
[10] M. Pinches, D. Tildesley, and W. Smith, Large scale molecular dynamics on parallel computers using the link-cell algorithm, Mol. Simul. 6, 51 (1991).
[11] J. S. Sims, J. G. Hagedorn, P. M. Ketcham, S. G. Satterfield, T. J. Griffin, W. L. George ... NISTIR 6709 (2001) p. 1.
[12] Collaborative Computational Projects (online), http://www.dl.ac.uk/CCP, Accessed March 2004.
[13] Message Passing Interface Forum, MPI: A message-passing interface standard, Int. J. Supercomput. Appl. 8 (3/4), 159 (1994).
[14] J. S. Sims, J. G. Hagedorn, P. M. Ketcham, S. G. Satterfield, T. J. Griffin, W. L. George, H. A. Fowler, B. A. am Ende, H. K. Hung, R. B. Bohn, J. E. Koontz, N. S. Martys, C. E. Bouldin, J. A. Warren, D. L. Feder, C. W. Clark, B. J. Filla, and J. E. Devaney, Accelerating scientific discovery through computation and visualization, J. Res. Natl. Inst. Stand. Technol. 105, 875 (2000).
[15] W. Smith, A replicated-data molecular dynamics strategy for the parallel Ewald sum, Comput. Phys. Commun. 67, 392 (1992).
[16] S. J. Plimpton, Fast Parallel Algorithms for Short-Range Molecular Dynamics, J. Comput. Phys. 117, 1 (1995).
[17] S. J. Plimpton, R. Pollock, and M. Stevens, Particle-Mesh Ewald and rRESPA for Parallel Molecular Dynamics Simulations, Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, March 1997.
[18] W. Gropp, E. Lusk, and A. Skjellum, Using MPI (2nd edition), The MIT Press, Cambridge, Mass. (1999).

James S. Sims and Nicos Martys
National Institute of Standards and Technology, Gaithersburg, MD 20899-8911
james.sims@nist.gov
nicos.martys@nist.gov

About the authors: James S. Sims is a computational scientist in the Scientific Applications and Visualization Group of the NIST Information Technology Laboratory. Nicos S. Martys is a physicist in the Materials and Structure Division of the NIST Building and Fire Research Laboratory. The National Institute of Standards and Technology is an agency of the Technology Administration, U.S. Department of Commerce.