The solver has been applied to a number of complex large-scale problems by groups at NASA Langley, industry, and academia. (See the Applications section of this manual.) Internally, the software has been used to study airframe noise, space transportation vehicles, flow control devices using synthetic jets, the design of wind tunnel and flight experiments, and so forth. Boeing, Lockheed, Cessna, New Piper, and others have used the tools for applications such as high-lift, cruise performance, and studies of revolutionary concepts. The software has also been used for military applications, large-scale computer science research at national labs, as well as algorithmic studies performed at universities around the country. For example, researchers at Georgia Tech have been using FUN3D as their base for rotorcraft work.
This manual describes the installation and execution of FUN3D version 14.0.2, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver may be used for formal design optimization, error estimation, and mesh adaptation. FUN3D also offers a reacting, real-gas capability and provides GPU acceleration of many common simulation options.
This manual describes the installation and execution of FUN3D version 14.0.1, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver may be used for formal design optimization, error estimation, and mesh adaptation. FUN3D also offers a reacting, real-gas capability and provides GPU acceleration of many common simulation options.
This manual describes the installation and execution of FUN3D version 14.0, including optional dependent packages. FUN3D is a suite of computational fluid dynamics simulation and design tools that uses mixed-element unstructured grids in a large number of formats, including structured multiblock and overset grid systems. A discretely-exact adjoint solver may be used for formal design optimization, error estimation, and mesh adaptation. FUN3D also offers a reacting, real-gas capability and provides GPU acceleration of many common simulation options.
We describe a systematic approach for rendering time-varying simulation data produced by exa-scale simulations, using GPU workstations. The data sets we focus on use adaptive mesh refinement (AMR) to overcome memory bandwidth limitations by representing interesting regions in space with high detail. Particularly, our focus is on data sets where the AMR hierarchy is fixed and does not change over time. Our study is motivated by the NASA Exajet, a large computational fluid dynamics simulation of a civilian cargo aircraft that consists of 423 simulation time steps, each storing 2.5 GB of data per scalar field, amounting to a total of 4 TB. We present strategies for rendering this time series data set with smooth animation and at interactive rates using current generation GPUs. We start with an unoptimized baseline and step by step extend that to support fast streaming updates. Our approach demonstrates how to push current visualization workstations and modern visualization APIs to their limits to achieve interactive visualization of exa-scale time series data sets.
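The data volumes quoted above can be checked with a little arithmetic. The sketch below multiplies out the per-field total and estimates the raw streaming rate needed to play the time series back. The function names and the 10 frames-per-second playback rate are illustrative assumptions, not figures from the study.

```python
def per_field_total_gb(n_steps, gb_per_step):
    """Total size of one scalar field over the whole time series, in GB."""
    return n_steps * gb_per_step


def streaming_rate_gb_per_s(gb_per_step, frames_per_second):
    """Raw data rate needed to show one new time step per rendered frame."""
    return gb_per_step * frames_per_second


# Figures taken from the abstract: 423 time steps, 2.5 GB per scalar field.
field_total = per_field_total_gb(423, 2.5)

# Hypothetical playback rate chosen only for illustration.
rate = streaming_rate_gb_per_s(2.5, 10)

print(field_total, "GB per scalar field")
print(rate, "GB/s to stream one field at 10 frames per second")
```

One field therefore occupies roughly 1 TB, so the quoted 4 TB total would be consistent with about four scalar fields. That field count is an inference from the numbers, not something the abstract states.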
Computational performance of the FUN3D unstructured-grid computational fluid dynamics (CFD) application on GPUs is highly dependent on the efficiency of floating-point atomic updates needed to support the irregular cell-, edge-, and node-based data access patterns in massively parallel GPU environments. We examine several optimization methods to improve GPU efficiency of performance-critical kernels that are dominated by atomic update costs on NVIDIA V100/A100 and AMD CDNA MI100 GPUs. Optimization on the AMD MI100 GPU was of primary interest since similar hardware will be used in the upcoming Frontier supercomputer. Techniques combining register shuffling and on-chip shared memory were used to transpose and/or aggregate results amongst collaborating GPU threads before atomically updating global memory. These techniques, along with algorithmic optimizations to reduce the update frequency, reduced the run-time of select kernels on the MI100 GPU by a factor of between 2.5 and 6.0 over atomically updating global memory directly. Performance impact on the NVIDIA GPUs was mixed with the performance of the V100 often degraded when using register-based aggregation/transposition techniques while the A100 generally benefited from these methods, though to a lesser extent than measured on the MI100 GPU. Overall, both V100 and A100 GPUs outperformed the MI100 GPU on kernels dominated by double-precision atomic updates; however, the techniques demonstrated here reduced the performance gap and improved the MI100 performance.
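The idea of aggregating results among cooperating threads before updating global memory can be illustrated with a small CPU-side emulation. This is only a sketch of the general pattern, not the paper's GPU kernels. The function names are hypothetical, and `group_size` stands in for a group of collaborating threads. Each increment of `updates` stands in for one atomic add to global memory.

```python
from collections import defaultdict


def scatter_direct(targets, values, n_nodes):
    """Add every contribution straight into the result array.

    Emulates issuing one global atomic add per contribution.
    """
    result = [0.0] * n_nodes
    updates = 0
    for node, value in zip(targets, values):
        result[node] += value
        updates += 1
    return result, updates


def scatter_aggregated(targets, values, n_nodes, group_size=32):
    """Combine contributions within each group before updating the result.

    Contributions in a group that target the same node are summed locally
    (standing in for register shuffles or shared memory), so only one
    global update per distinct node per group is issued.
    """
    result = [0.0] * n_nodes
    updates = 0
    for start in range(0, len(targets), group_size):
        partial = defaultdict(float)
        group_targets = targets[start:start + group_size]
        group_values = values[start:start + group_size]
        for node, value in zip(group_targets, group_values):
            partial[node] += value
        for node, total in partial.items():
            result[node] += total
            updates += 1
    return result, updates


# Eight contributions scattered to three nodes, processed in groups of four.
targets = [0, 0, 1, 1, 0, 2, 2, 2]
values = [1.0] * 8
direct_result, direct_updates = scatter_direct(targets, values, 3)
agg_result, agg_updates = scatter_aggregated(targets, values, 3, group_size=4)
print(direct_updates, agg_updates)
```

Both routines produce the same sums here, while the aggregated version issues fewer global updates whenever several contributions in a group hit the same node. On real hardware the local combining step has its own cost, which is consistent with the mixed results the abstract reports across GPUs.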
A high-fidelity multidisciplinary analysis and gradient-based optimization tool for rotorcraft aero-acoustics is presented. Tightly coupled discipline models include physics-based computational fluid dynamics, rotorcraft comprehensive analysis, and noise prediction and propagation. A discretely consistent adjoint methodology accounts for sensitivities of unsteady flows and unstructured, dynamically deforming, overset grids. The sensitivities of structural responses to blade aerodynamic loads are computed using a complex-variable approach. Sensitivities of acoustic metrics are computed by chain-rule differentiation. Interfaces are developed for interactions between the discipline models for rotorcraft aeroacoustic analysis and the integrated sensitivity analysis. The multidisciplinary sensitivity analysis is verified through a complex-variable approach. To verify functionality of the multidisciplinary analysis and optimization tool, an optimization problem for a 40% Mach-scaled HART-II rotor-and-fuselage configuration is crafted with the objective of reducing thickness noise subject to aerodynamic and geometric constraints. The optimized configuration achieves a noticeable noise reduction, satisfies all required constraints, and produces thinner blades as expected. Computational cost of the optimization cycle is assessed in a high-performance computing environment and found to be acceptable for design of rotorcraft in general level-flight conditions.
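A "complex-variable approach" to sensitivities is commonly realized as complex-step differentiation, and the sketch below assumes that reading. It perturbs the input along the imaginary axis and reads the derivative from the imaginary part of the output. The test function and step size are illustrative and unrelated to the rotorcraft models in the paper.

```python
import cmath


def complex_step_derivative(f, x, h=1e-30):
    """Approximate df/dx at real x as Im(f(x + i*h)) / h.

    No subtraction of nearly equal numbers occurs, so h can be tiny
    without the cancellation error of finite differences. The function f
    must accept complex arguments and be analytic near x.
    """
    return f(complex(x, h)).imag / h


def example_function(x):
    """Illustrative analytic function: x^3 + sin(x)."""
    return x * x * x + cmath.sin(x)


derivative = complex_step_derivative(example_function, 2.0)
print(derivative)
```

For this function the exact derivative is 3x^2 + cos(x), which gives a reference value to compare against at x = 2.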
This paper presents a novel, efficient, conservative, edge-based method for evaluation of meanflow viscous fluxes and turbulence-model diffusion terms of the Reynolds-averaged Navier-Stokes equations on tetrahedral grids. The new method is implemented in a practical, node-centered, finite-volume computational fluid dynamics solver. The baseline finite-volume scheme that is equivalent to a second-order accurate finite-element Galerkin approximation of viscous stresses is reformulated. The order of operations to compute the cell-based Green-Gauss gradients is changed to combine the operations by edge, which leads to an equivalent formulation on tetrahedral grids, improves efficiency, and preserves the compact discretization stencil based on the nearest neighbors. The computational results presented in this paper verify the implementation of this edge-based method by comparing its accuracy and iterative convergence with those of the well verified and validated baseline formulation. Efficiency gains for residual and Jacobian evaluations result in significant reduction of time to solution. This novel edge-based formulation on tetrahedra can be seamlessly combined with the baseline formulation on cells of other types for computing solutions on mixed-element grids.
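For orientation, the sketch below computes a cell-based Green-Gauss gradient on a single tetrahedron, the kind of quantity the baseline formulation evaluates. It does not implement the paper's edge-based reordering. It assumes a non-degenerate tetrahedron and takes each face value as the mean of its three vertex values. All helper names are hypothetical.

```python
def sub(a, b):
    """Component-wise difference of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def green_gauss_gradient(vertices, values):
    """Green-Gauss gradient on one tetrahedron.

    grad = (1 / V) * sum over faces of (mean face value * outward area vector)
    """
    p0, p1, p2, p3 = vertices
    volume = abs(dot(sub(p1, p0), cross(sub(p2, p0), sub(p3, p0)))) / 6.0
    grad = [0.0, 0.0, 0.0]
    for opposite in range(4):
        face = [i for i in range(4) if i != opposite]
        a, b, c = (vertices[i] for i in face)
        area_vec = tuple(0.5 * x for x in cross(sub(b, a), sub(c, a)))
        # Flip the area vector if it points toward the opposite vertex.
        if dot(area_vec, sub(a, vertices[opposite])) < 0.0:
            area_vec = tuple(-x for x in area_vec)
        face_mean = sum(values[i] for i in face) / 3.0
        for k in range(3):
            grad[k] += face_mean * area_vec[k]
    return tuple(g / volume for g in grad)


# Linear test field phi = 2x + 3y - z + 5 on the unit tetrahedron.
unit_tet = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
phi = [2.0 * x + 3.0 * y - z + 5.0 for (x, y, z) in unit_tet]
gradient = green_gauss_gradient(unit_tet, phi)
print(gradient)
```

Because the test field is linear, the computed gradient should match its constant gradient (2, 3, -1) up to round-off.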
This paper presents a hierarchical adaptive nonlinear iteration method (HANIM) implemented in the NASA computational fluid dynamics code, FUN3D, to improve robustness and computational efficiency. In contrast to the legacy FUN3D iterative solver that relies on an approximate Jacobian, a simple multicolor Gauss-Seidel point-implicit iteration scheme, and linear Courant-Friedrichs-Lewy number (CFL) ramping, HANIM is based upon a hierarchy of modules including preconditioner, generalized conjugate residual, realizability check, nonlinear control, and CFL adaption modules. HANIM performance is systematically compared with the performance of the legacy solver of FUN3D and a baseline solver based on a preconditioner alone. Iterative solutions are compared for three benchmark cases: a subsonic separated flow around a hemisphere cylinder, a supersonic flow through a long duct, and a subsonic flow over the NASA wing-fuselage juncture model. Two Reynolds-averaged Navier-Stokes turbulence models are used in these computations, namely, the negative variant of the linear one-equation Spalart-Allmaras model and its nonlinear extension based on quadratic constitutive relations.
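The linear CFL ramping attributed to the legacy solver can be pictured as a schedule that raises the CFL number at a constant rate over a fixed number of iterations and then holds it. The sketch below is a generic illustration of such a schedule. The function and parameter names are hypothetical and are not FUN3D inputs.

```python
def linear_cfl_ramp(iteration, cfl_start, cfl_end, ramp_iterations):
    """Return the CFL number for a given nonlinear iteration (0-based).

    The CFL number moves linearly from cfl_start to cfl_end over
    ramp_iterations iterations and stays at cfl_end afterwards.
    """
    if ramp_iterations <= 0:
        return cfl_end
    fraction = min(iteration / ramp_iterations, 1.0)
    return cfl_start + fraction * (cfl_end - cfl_start)


# Illustrative schedule: ramp from 1 to 201 over 100 iterations.
schedule = [linear_cfl_ramp(it, 1.0, 201.0, 100) for it in (0, 50, 100, 200)]
print(schedule)
```

A fixed schedule like this does not react to how the solve is progressing, whereas the abstract describes HANIM as including a CFL adaption module.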
A linear solver algorithm used by a large-scale unstructured-grid computational fluid dynamics application is examined for a broad range of familiar and emerging architectures. Efficient implementation of a linear solver is challenging on recent CPUs offering vector architectures. Vector loads and stores are essential to effectively utilize available memory bandwidth on CPUs, and maintaining performance across different CPUs can be difficult in the face of varying vector lengths offered by each. A similar challenge occurs on GPU architectures, where it is essential to have coalesced memory accesses to utilize memory bandwidth effectively. In this work, we demonstrate that restructuring a computation, and possibly data layout, with regard to architecture is essential to achieve optimal performance by establishing a performance benchmark for each target architecture in a low level language such as vector intrinsics or CUDA. In doing so, we demonstrate how a linear solver kernel can be mapped to Intel Xeon and Xeon Phi, Marvell ThunderX2, NEC SX-Aurora TSUBASA Vector Engine, and NVIDIA and AMD GPUs. We further demonstrate that the required code restructuring can be achieved in higher level programming environments such as OpenACC, OCCA, and Intel OneAPI/SYCL, and that each generally results in optimal performance on the target architecture. Relative performance metrics for all implementations are shown, and subjective ratings for ease of implementation and optimization are suggested.