|
|
@@ -4,12 +4,6 @@ FindCUDA
|
|
|
|
|
|
.. deprecated:: 3.10
|
|
|
|
|
|
- Superseded by first-class support for the CUDA language in CMake.
|
|
|
- Superseded by the :module:`FindCUDAToolkit` for CUDA toolkit libraries.
|
|
|
-
|
|
|
-Replacement
|
|
|
-^^^^^^^^^^^
|
|
|
-
|
|
|
It is no longer necessary to use this module or call ``find_package(CUDA)``
|
|
|
for compiling CUDA code. Instead, list ``CUDA`` among the languages named
|
|
|
in the top-level call to the :command:`project` command, or call the
|
|
|
@@ -17,9 +11,10 @@ in the top-level call to the :command:`project` command, or call the
|
|
|
Then one can add CUDA (``.cu``) sources to programs directly
|
|
|
in calls to :command:`add_library` and :command:`add_executable`.
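As a minimal sketch of the replacement described above (the project name
``example``, the target ``demo``, and the source file names are hypothetical):

.. code-block:: cmake

  cmake_minimum_required(VERSION 3.10)
  # Enabling the CUDA language replaces find_package(CUDA).
  project(example LANGUAGES CXX CUDA)

  # .cu sources are listed alongside ordinary C++ sources.
  add_executable(demo main.cpp kernel.cu)

No ``cuda_add_executable()`` call is needed in this form; the ``.cu`` file is
compiled by the CUDA compiler that CMake detects for the ``CUDA`` language.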
|
|
|
|
|
|
-To find and use the CUDA toolkit libraries the :module:`FindCUDAToolkit`
|
|
|
-module has superseded this module. It works whether or not the ``CUDA``
|
|
|
-language is enabled.
|
|
|
+.. versionadded:: 3.17
|
|
|
+ To find and use the CUDA toolkit libraries the :module:`FindCUDAToolkit`
|
|
|
+ module has superseded this module. It works whether or not the ``CUDA``
|
|
|
+ language is enabled.
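A hedged sketch of linking against the CUDA runtime through
:module:`FindCUDAToolkit` (the target ``demo`` and its source file are
hypothetical, and the ``CUDA`` language is deliberately left disabled):

.. code-block:: cmake

  cmake_minimum_required(VERSION 3.17)
  project(example LANGUAGES CXX)

  find_package(CUDAToolkit REQUIRED)
  add_executable(demo main.cpp)
  # CUDA::cudart is an imported target provided by FindCUDAToolkit.
  target_link_libraries(demo PRIVATE CUDA::cudart)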
|
|
|
|
|
|
Documentation of Deprecated Usage
|
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
@@ -30,6 +25,9 @@ This script locates the NVIDIA CUDA C tools. It should work on Linux,
|
|
|
Windows, and macOS and should be reasonably up to date with CUDA C
|
|
|
releases.
|
|
|
|
|
|
+.. versionadded:: 3.19
|
|
|
+ QNX support.
|
|
|
+
|
|
|
This script makes use of the standard :command:`find_package` arguments of
|
|
|
``<VERSION>``, ``REQUIRED`` and ``QUIET``. ``CUDA_FOUND`` will report if an
|
|
|
acceptable version of CUDA was found.
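For illustration only, deprecated usage might look like the following
(the version ``9.0`` is an arbitrary example value):

.. code-block:: cmake

  # QUIET suppresses messages; without REQUIRED, a failed search is not fatal.
  find_package(CUDA 9.0 QUIET)
  if(CUDA_FOUND)
    message(STATUS "Found CUDA ${CUDA_VERSION_STRING}")
  else()
    message(STATUS "No acceptable CUDA version found")
  endif()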
|
|
|
@@ -50,342 +48,476 @@ location. In newer versions of the toolkit the CUDA library is
|
|
|
included with the graphics driver -- be sure that the driver version
|
|
|
matches what is needed by the CUDA runtime version.
|
|
|
|
|
|
+Input Variables
|
|
|
+"""""""""""""""
|
|
|
+
|
|
|
The following variables affect the behavior of the macros in the
|
|
|
script (in alphabetical order). Note that any of these flags can be
|
|
|
changed multiple times in the same directory before calling
|
|
|
-``CUDA_ADD_EXECUTABLE``, ``CUDA_ADD_LIBRARY``, ``CUDA_COMPILE``,
|
|
|
-``CUDA_COMPILE_PTX``, ``CUDA_COMPILE_FATBIN``, ``CUDA_COMPILE_CUBIN``
|
|
|
-or ``CUDA_WRAP_SRCS``::
|
|
|
-
|
|
|
- CUDA_64_BIT_DEVICE_CODE (Default matches host bit size)
|
|
|
- -- Set to ON to compile for 64 bit device code, OFF for 32 bit device code.
|
|
|
- Note that making this different from the host code when generating object
|
|
|
- or C files from CUDA code just won't work, because size_t gets defined by
|
|
|
- nvcc in the generated source. If you compile to PTX and then load the
|
|
|
- file yourself, you can mix bit sizes between device and host.
|
|
|
-
|
|
|
- CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE (Default ON)
|
|
|
- -- Set to ON if you want the custom build rule to be attached to the source
|
|
|
- file in Visual Studio. Turn OFF if you add the same cuda file to multiple
|
|
|
- targets.
|
|
|
-
|
|
|
- This allows the user to build the target from the CUDA file; however, bad
|
|
|
- things can happen if the CUDA source file is added to multiple targets.
|
|
|
- When performing parallel builds it is possible for the custom build
|
|
|
- command to be run more than once and in parallel causing cryptic build
|
|
|
- errors. VS runs the rules for every source file in the target, and a
|
|
|
- source can have only one rule no matter how many projects it is added to.
|
|
|
- When the rule is run from multiple targets race conditions can occur on
|
|
|
- the generated file. Eventually everything will get built, but if the user
|
|
|
- is unaware of this behavior, there may be confusion. It would be nice if
|
|
|
- this script could detect the reuse of source files across multiple targets
|
|
|
- and turn the option off for the user, but no good solution could be found.
|
|
|
-
|
|
|
- CUDA_BUILD_CUBIN (Default OFF)
|
|
|
- -- Set to ON to enable and extra compilation pass with the -cubin option in
|
|
|
- Device mode. The output is parsed and register, shared memory usage is
|
|
|
- printed during build.
|
|
|
-
|
|
|
- CUDA_BUILD_EMULATION (Default OFF for device mode)
|
|
|
- -- Set to ON for Emulation mode. -D_DEVICEEMU is defined for CUDA C files
|
|
|
- when CUDA_BUILD_EMULATION is TRUE.
|
|
|
-
|
|
|
- CUDA_LINK_LIBRARIES_KEYWORD (Default "")
|
|
|
- -- The <PRIVATE|PUBLIC|INTERFACE> keyword to use for internal
|
|
|
- target_link_libraries calls. The default is to use no keyword which
|
|
|
- uses the old "plain" form of target_link_libraries. Note that is matters
|
|
|
- because whatever is used inside the FindCUDA module must also be used
|
|
|
- outside - the two forms of target_link_libraries cannot be mixed.
|
|
|
-
|
|
|
- CUDA_GENERATED_OUTPUT_DIR (Default CMAKE_CURRENT_BINARY_DIR)
|
|
|
- -- Set to the path you wish to have the generated files placed. If it is
|
|
|
- blank output files will be placed in CMAKE_CURRENT_BINARY_DIR.
|
|
|
- Intermediate files will always be placed in
|
|
|
- CMAKE_CURRENT_BINARY_DIR/CMakeFiles.
|
|
|
-
|
|
|
- CUDA_HOST_COMPILATION_CPP (Default ON)
|
|
|
- -- Set to OFF for C compilation of host code.
|
|
|
-
|
|
|
- CUDA_HOST_COMPILER (Default CMAKE_C_COMPILER)
|
|
|
- -- Set the host compiler to be used by nvcc. Ignored if -ccbin or
|
|
|
- --compiler-bindir is already present in the CUDA_NVCC_FLAGS or
|
|
|
- CUDA_NVCC_FLAGS_<CONFIG> variables. For Visual Studio targets,
|
|
|
- the host compiler is constructed with one or more visual studio macros
|
|
|
- such as $(VCInstallDir), that expands out to the path when
|
|
|
- the command is run from within VS.
|
|
|
- If the CUDAHOSTCXX environment variable is set it will
|
|
|
- be used as the default.
|
|
|
+``cuda_add_executable()``, ``cuda_add_library()``, ``cuda_compile()``,
|
|
|
+``cuda_compile_ptx()``, ``cuda_compile_fatbin()``, ``cuda_compile_cubin()``
|
|
|
+or ``cuda_wrap_srcs()``:
|
|
|
+
|
|
|
+``CUDA_64_BIT_DEVICE_CODE`` (Default: host bit size)
|
|
|
+ Set to ``ON`` to compile for 64-bit device code, ``OFF`` for 32-bit device code.
|
|
|
+ Note that making this different from the host code when generating object
|
|
|
+ or C files from CUDA code just won't work, because size_t gets defined by
|
|
|
+ nvcc in the generated source. If you compile to PTX and then load the
|
|
|
+ file yourself, you can mix bit sizes between device and host.
|
|
|
+
|
|
|
+``CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE`` (Default: ``ON``)
|
|
|
+ Set to ``ON`` if you want the custom build rule to be attached to the source
|
|
|
+ file in Visual Studio. Turn ``OFF`` if you add the same CUDA file to multiple
|
|
|
+ targets.
|
|
|
+
|
|
|
+ This allows the user to build the target from the CUDA file; however, bad
|
|
|
+ things can happen if the CUDA source file is added to multiple targets.
|
|
|
+ When performing parallel builds it is possible for the custom build
|
|
|
+ command to be run more than once and in parallel causing cryptic build
|
|
|
+ errors. VS runs the rules for every source file in the target, and a
|
|
|
+ source can have only one rule no matter how many projects it is added to.
|
|
|
+ When the rule is run from multiple targets race conditions can occur on
|
|
|
+ the generated file. Eventually everything will get built, but if the user
|
|
|
+ is unaware of this behavior, there may be confusion. It would be nice if
|
|
|
+ this script could detect the reuse of source files across multiple targets
|
|
|
+ and turn the option off for the user, but no good solution could be found.
|
|
|
+
|
|
|
+``CUDA_BUILD_CUBIN`` (Default: ``OFF``)
|
|
|
+ Set to ``ON`` to enable an extra compilation pass with the ``-cubin`` option in
|
|
|
+ Device mode. The output is parsed and register and shared memory usage is
|
|
|
+ printed during build.
|
|
|
+
|
|
|
+``CUDA_BUILD_EMULATION`` (Default: ``OFF`` for device mode)
|
|
|
+ Set to ``ON`` for Emulation mode. ``-D_DEVICEEMU`` is defined for CUDA C files
|
|
|
+ when ``CUDA_BUILD_EMULATION`` is ``TRUE``.
|
|
|
+
|
|
|
+``CUDA_LINK_LIBRARIES_KEYWORD`` (Default: ``""``)
|
|
|
+ .. versionadded:: 3.9
|
|
|
+
|
|
|
+ The ``<PRIVATE|PUBLIC|INTERFACE>`` keyword to use for internal
|
|
|
+ :command:`target_link_libraries` calls. The default is to use no keyword which
|
|
|
+ uses the old "plain" form of :command:`target_link_libraries`. Note that it matters
|
|
|
+ because whatever is used inside the ``FindCUDA`` module must also be used
|
|
|
+ outside - the two forms of :command:`target_link_libraries` cannot be mixed.
|
|
|
+
|
|
|
+``CUDA_GENERATED_OUTPUT_DIR`` (Default: :variable:`CMAKE_CURRENT_BINARY_DIR`)
|
|
|
+ Set to the path you wish to have the generated files placed. If it is
|
|
|
+ blank, output files will be placed in :variable:`CMAKE_CURRENT_BINARY_DIR`.
|
|
|
+ Intermediate files will always be placed in
|
|
|
+ ``CMAKE_CURRENT_BINARY_DIR/CMakeFiles``.
|
|
|
+
|
|
|
+``CUDA_HOST_COMPILATION_CPP`` (Default: ``ON``)
|
|
|
+ Set to ``OFF`` for C compilation of host code.
|
|
|
+
|
|
|
+``CUDA_HOST_COMPILER`` (Default: ``CMAKE_C_COMPILER``)
|
|
|
+ Set the host compiler to be used by nvcc. Ignored if ``-ccbin`` or
|
|
|
+ ``--compiler-bindir`` is already present in the ``CUDA_NVCC_FLAGS`` or
|
|
|
+ ``CUDA_NVCC_FLAGS_<CONFIG>`` variables. For Visual Studio targets,
|
|
|
+ the host compiler is constructed with one or more Visual Studio macros
|
|
|
+ such as ``$(VCInstallDir)``, which expand to the path when
|
|
|
+ the command is run from within VS.
|
|
|
+
|
|
|
+ .. versionadded:: 3.13
|
|
|
+ If the :envvar:`CUDAHOSTCXX` environment variable is set it will
|
|
|
+ be used as the default.
|
|
|
+
|
|
|
+``CUDA_NVCC_FLAGS``, ``CUDA_NVCC_FLAGS_<CONFIG>``
|
|
|
+ Additional NVCC command line arguments. NOTE: multiple arguments must be
|
|
|
+ semicolon-delimited (e.g. ``--compiler-options;-Wall``).
|
|
|
+
|
|
|
+ .. versionadded:: 3.6
|
|
|
+ Contents of these variables may use
|
|
|
+ :manual:`generator expressions <cmake-generator-expressions(7)>`.
|
|
|
+
|
|
|
+``CUDA_PROPAGATE_HOST_FLAGS`` (Default: ``ON``)
|
|
|
+ Set to ``ON`` to propagate :variable:`CMAKE_{C,CXX}_FLAGS <CMAKE_<LANG>_FLAGS>` and their configuration
|
|
|
+ dependent counterparts (e.g. ``CMAKE_C_FLAGS_DEBUG``) automatically to the
|
|
|
+ host compiler through nvcc's ``-Xcompiler`` flag. This helps make the
|
|
|
+ generated host code match the rest of the system better. Sometimes
|
|
|
+ certain flags give nvcc problems, and this will help you turn the flag
|
|
|
+ propagation off. This does not affect the flags supplied directly to nvcc
|
|
|
+ via ``CUDA_NVCC_FLAGS`` or through the ``OPTION`` flags specified through
|
|
|
+ ``cuda_add_library()``, ``cuda_add_executable()``, or ``cuda_wrap_srcs()``. Flags used for
|
|
|
+ shared library compilation are not affected by this flag.
|
|
|
|
|
|
- CUDA_NVCC_FLAGS
|
|
|
- CUDA_NVCC_FLAGS_<CONFIG>
|
|
|
- -- Additional NVCC command line arguments. NOTE: multiple arguments must be
|
|
|
- semi-colon delimited (e.g. --compiler-options;-Wall)
|
|
|
-
|
|
|
- CUDA_PROPAGATE_HOST_FLAGS (Default ON)
|
|
|
- -- Set to ON to propagate CMAKE_{C,CXX}_FLAGS and their configuration
|
|
|
- dependent counterparts (e.g. CMAKE_C_FLAGS_DEBUG) automatically to the
|
|
|
- host compiler through nvcc's -Xcompiler flag. This helps make the
|
|
|
- generated host code match the rest of the system better. Sometimes
|
|
|
- certain flags give nvcc problems, and this will help you turn the flag
|
|
|
- propagation off. This does not affect the flags supplied directly to nvcc
|
|
|
- via CUDA_NVCC_FLAGS or through the OPTION flags specified through
|
|
|
- CUDA_ADD_LIBRARY, CUDA_ADD_EXECUTABLE, or CUDA_WRAP_SRCS. Flags used for
|
|
|
- shared library compilation are not affected by this flag.
|
|
|
-
|
|
|
- CUDA_SEPARABLE_COMPILATION (Default OFF)
|
|
|
- -- If set this will enable separable compilation for all CUDA runtime object
|
|
|
- files. If used outside of CUDA_ADD_EXECUTABLE and CUDA_ADD_LIBRARY
|
|
|
- (e.g. calling CUDA_WRAP_SRCS directly),
|
|
|
- CUDA_COMPUTE_SEPARABLE_COMPILATION_OBJECT_FILE_NAME and
|
|
|
- CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS should be called.
|
|
|
-
|
|
|
- CUDA_SOURCE_PROPERTY_FORMAT
|
|
|
- -- If this source file property is set, it can override the format specified
|
|
|
- to CUDA_WRAP_SRCS (OBJ, PTX, CUBIN, or FATBIN). If an input source file
|
|
|
- is not a .cu file, setting this file will cause it to be treated as a .cu
|
|
|
- file. See documentation for set_source_files_properties on how to set
|
|
|
- this property.
|
|
|
-
|
|
|
- CUDA_USE_STATIC_CUDA_RUNTIME (Default ON)
|
|
|
- -- When enabled the static version of the CUDA runtime library will be used
|
|
|
- in CUDA_LIBRARIES. If the version of CUDA configured doesn't support
|
|
|
- this option, then it will be silently disabled.
|
|
|
-
|
|
|
- CUDA_VERBOSE_BUILD (Default OFF)
|
|
|
- -- Set to ON to see all the commands used when building the CUDA file. When
|
|
|
- using a Makefile generator the value defaults to VERBOSE (run make
|
|
|
- VERBOSE=1 to see output), although setting CUDA_VERBOSE_BUILD to ON will
|
|
|
- always print the output.
|
|
|
-
|
|
|
-The script creates the following macros (in alphabetical order)::
|
|
|
-
|
|
|
- CUDA_ADD_CUFFT_TO_TARGET( cuda_target )
|
|
|
- -- Adds the cufft library to the target (can be any target). Handles whether
|
|
|
- you are in emulation mode or not.
|
|
|
-
|
|
|
- CUDA_ADD_CUBLAS_TO_TARGET( cuda_target )
|
|
|
- -- Adds the cublas library to the target (can be any target). Handles
|
|
|
- whether you are in emulation mode or not.
|
|
|
-
|
|
|
- CUDA_ADD_EXECUTABLE( cuda_target file0 file1 ...
|
|
|
- [WIN32] [MACOSX_BUNDLE] [EXCLUDE_FROM_ALL] [OPTIONS ...] )
|
|
|
- -- Creates an executable "cuda_target" which is made up of the files
|
|
|
- specified. All of the non CUDA C files are compiled using the standard
|
|
|
- build rules specified by CMAKE and the cuda files are compiled to object
|
|
|
- files using nvcc and the host compiler. In addition CUDA_INCLUDE_DIRS is
|
|
|
- added automatically to include_directories(). Some standard CMake target
|
|
|
- calls can be used on the target after calling this macro
|
|
|
- (e.g. set_target_properties and target_link_libraries), but setting
|
|
|
- properties that adjust compilation flags will not affect code compiled by
|
|
|
- nvcc. Such flags should be modified before calling CUDA_ADD_EXECUTABLE,
|
|
|
- CUDA_ADD_LIBRARY or CUDA_WRAP_SRCS.
|
|
|
-
|
|
|
- CUDA_ADD_LIBRARY( cuda_target file0 file1 ...
|
|
|
- [STATIC | SHARED | MODULE] [EXCLUDE_FROM_ALL] [OPTIONS ...] )
|
|
|
- -- Same as CUDA_ADD_EXECUTABLE except that a library is created.
|
|
|
-
|
|
|
- CUDA_BUILD_CLEAN_TARGET()
|
|
|
- -- Creates a convenience target that deletes all the dependency files
|
|
|
- generated. You should make clean after running this target to ensure the
|
|
|
- dependency files get regenerated.
|
|
|
-
|
|
|
- CUDA_COMPILE( generated_files file0 file1 ... [STATIC | SHARED | MODULE]
|
|
|
- [OPTIONS ...] )
|
|
|
- -- Returns a list of generated files from the input source files to be used
|
|
|
- with ADD_LIBRARY or ADD_EXECUTABLE.
|
|
|
-
|
|
|
- CUDA_COMPILE_PTX( generated_files file0 file1 ... [OPTIONS ...] )
|
|
|
- -- Returns a list of PTX files generated from the input source files.
|
|
|
-
|
|
|
- CUDA_COMPILE_FATBIN( generated_files file0 file1 ... [OPTIONS ...] )
|
|
|
- -- Returns a list of FATBIN files generated from the input source files.
|
|
|
-
|
|
|
- CUDA_COMPILE_CUBIN( generated_files file0 file1 ... [OPTIONS ...] )
|
|
|
- -- Returns a list of CUBIN files generated from the input source files.
|
|
|
-
|
|
|
- CUDA_COMPUTE_SEPARABLE_COMPILATION_OBJECT_FILE_NAME( output_file_var
|
|
|
- cuda_target
|
|
|
- object_files )
|
|
|
- -- Compute the name of the intermediate link file used for separable
|
|
|
- compilation. This file name is typically passed into
|
|
|
- CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS. output_file_var is produced
|
|
|
- based on cuda_target the list of objects files that need separable
|
|
|
- compilation as specified by object_files. If the object_files list is
|
|
|
- empty, then output_file_var will be empty. This function is called
|
|
|
- automatically for CUDA_ADD_LIBRARY and CUDA_ADD_EXECUTABLE. Note that
|
|
|
- this is a function and not a macro.
|
|
|
-
|
|
|
- CUDA_INCLUDE_DIRECTORIES( path0 path1 ... )
|
|
|
- -- Sets the directories that should be passed to nvcc
|
|
|
- (e.g. nvcc -Ipath0 -Ipath1 ... ). These paths usually contain other .cu
|
|
|
- files.
|
|
|
-
|
|
|
-
|
|
|
- CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS( output_file_var cuda_target
|
|
|
- nvcc_flags object_files)
|
|
|
- -- Generates the link object required by separable compilation from the given
|
|
|
- object files. This is called automatically for CUDA_ADD_EXECUTABLE and
|
|
|
- CUDA_ADD_LIBRARY, but can be called manually when using CUDA_WRAP_SRCS
|
|
|
- directly. When called from CUDA_ADD_LIBRARY or CUDA_ADD_EXECUTABLE the
|
|
|
- nvcc_flags passed in are the same as the flags passed in via the OPTIONS
|
|
|
- argument. The only nvcc flag added automatically is the bitness flag as
|
|
|
- specified by CUDA_64_BIT_DEVICE_CODE. Note that this is a function
|
|
|
- instead of a macro.
|
|
|
-
|
|
|
- CUDA_SELECT_NVCC_ARCH_FLAGS(out_variable [target_CUDA_architectures])
|
|
|
- -- Selects GPU arch flags for nvcc based on target_CUDA_architectures
|
|
|
- target_CUDA_architectures : Auto | Common | All | LIST(ARCH_AND_PTX ...)
|
|
|
- - "Auto" detects local machine GPU compute arch at runtime.
|
|
|
- - "Common" and "All" cover common and entire subsets of architectures
|
|
|
- ARCH_AND_PTX : NAME | NUM.NUM | NUM.NUM(NUM.NUM) | NUM.NUM+PTX
|
|
|
- NAME: Fermi Kepler Maxwell Kepler+Tegra Kepler+Tesla Maxwell+Tegra Pascal
|
|
|
- NUM: Any number. Only those pairs are currently accepted by NVCC though:
|
|
|
- 2.0 2.1 3.0 3.2 3.5 3.7 5.0 5.2 5.3 6.0 6.2
|
|
|
- Returns LIST of flags to be added to CUDA_NVCC_FLAGS in ${out_variable}
|
|
|
- Additionally, sets ${out_variable}_readable to the resulting numeric list
|
|
|
- Example:
|
|
|
- CUDA_SELECT_NVCC_ARCH_FLAGS(ARCH_FLAGS 3.0 3.5+PTX 5.2(5.0) Maxwell)
|
|
|
- LIST(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
|
|
|
-
|
|
|
- More info on CUDA architectures: https://en.wikipedia.org/wiki/CUDA
|
|
|
- Note that this is a function instead of a macro.
|
|
|
-
|
|
|
- CUDA_WRAP_SRCS ( cuda_target format generated_files file0 file1 ...
|
|
|
- [STATIC | SHARED | MODULE] [OPTIONS ...] )
|
|
|
- -- This is where all the magic happens. CUDA_ADD_EXECUTABLE,
|
|
|
- CUDA_ADD_LIBRARY, CUDA_COMPILE, and CUDA_COMPILE_PTX all call this
|
|
|
- function under the hood.
|
|
|
-
|
|
|
- Given the list of files (file0 file1 ... fileN) this macro generates
|
|
|
- custom commands that generate either PTX or linkable objects (use "PTX" or
|
|
|
- "OBJ" for the format argument to switch). Files that don't end with .cu
|
|
|
- or have the HEADER_FILE_ONLY property are ignored.
|
|
|
-
|
|
|
- The arguments passed in after OPTIONS are extra command line options to
|
|
|
- give to nvcc. You can also specify per configuration options by
|
|
|
- specifying the name of the configuration followed by the options. General
|
|
|
- options must precede configuration specific options. Not all
|
|
|
- configurations need to be specified, only the ones provided will be used.
|
|
|
-
|
|
|
- OPTIONS -DFLAG=2 "-DFLAG_OTHER=space in flag"
|
|
|
- DEBUG -g
|
|
|
- RELEASE --use_fast_math
|
|
|
- RELWITHDEBINFO --use_fast_math;-g
|
|
|
- MINSIZEREL --use_fast_math
|
|
|
-
|
|
|
- For certain configurations (namely VS generating object files with
|
|
|
- CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE set to ON), no generated file will
|
|
|
- be produced for the given cuda file. This is because when you add the
|
|
|
- cuda file to Visual Studio it knows that this file produces an object file
|
|
|
- and will link in the resulting object file automatically.
|
|
|
-
|
|
|
- This script will also generate a separate cmake script that is used at
|
|
|
- build time to invoke nvcc. This is for several reasons.
|
|
|
-
|
|
|
- 1. nvcc can return negative numbers as return values which confuses
|
|
|
- Visual Studio into thinking that the command succeeded. The script now
|
|
|
- checks the error codes and produces errors when there was a problem.
|
|
|
-
|
|
|
- 2. nvcc has been known to not delete incomplete results when it
|
|
|
- encounters problems. This confuses build systems into thinking the
|
|
|
- target was generated when in fact an unusable file exists. The script
|
|
|
- now deletes the output files if there was an error.
|
|
|
-
|
|
|
- 3. By putting all the options that affect the build into a file and then
|
|
|
- make the build rule dependent on the file, the output files will be
|
|
|
- regenerated when the options change.
|
|
|
-
|
|
|
- This script also looks at optional arguments STATIC, SHARED, or MODULE to
|
|
|
- determine when to target the object compilation for a shared library.
|
|
|
- BUILD_SHARED_LIBS is ignored in CUDA_WRAP_SRCS, but it is respected in
|
|
|
- CUDA_ADD_LIBRARY. On some systems special flags are added for building
|
|
|
- objects intended for shared libraries. A preprocessor macro,
|
|
|
- <target_name>_EXPORTS is defined when a shared library compilation is
|
|
|
- detected.
|
|
|
-
|
|
|
- Flags passed into add_definitions with -D or /D are passed along to nvcc.
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-The script defines the following variables::
|
|
|
-
|
|
|
- CUDA_VERSION_MAJOR -- The major version of cuda as reported by nvcc.
|
|
|
- CUDA_VERSION_MINOR -- The minor version.
|
|
|
- CUDA_VERSION
|
|
|
- CUDA_VERSION_STRING -- CUDA_VERSION_MAJOR.CUDA_VERSION_MINOR
|
|
|
- CUDA_HAS_FP16 -- Whether a short float (float16,fp16) is supported.
|
|
|
-
|
|
|
- CUDA_TOOLKIT_ROOT_DIR -- Path to the CUDA Toolkit (defined if not set).
|
|
|
- CUDA_SDK_ROOT_DIR -- Path to the CUDA SDK. Use this to find files in the
|
|
|
- SDK. This script will not directly support finding
|
|
|
- specific libraries or headers, as that isn't
|
|
|
- supported by NVIDIA. If you want to change
|
|
|
- libraries when the path changes see the
|
|
|
- FindCUDA.cmake script for an example of how to clear
|
|
|
- these variables. There are also examples of how to
|
|
|
- use the CUDA_SDK_ROOT_DIR to locate headers or
|
|
|
- libraries, if you so choose (at your own risk).
|
|
|
- CUDA_INCLUDE_DIRS -- Include directory for cuda headers. Added automatically
|
|
|
- for CUDA_ADD_EXECUTABLE and CUDA_ADD_LIBRARY.
|
|
|
- CUDA_LIBRARIES -- Cuda RT library.
|
|
|
- CUDA_CUFFT_LIBRARIES -- Device or emulation library for the Cuda FFT
|
|
|
- implementation (alternative to:
|
|
|
- CUDA_ADD_CUFFT_TO_TARGET macro)
|
|
|
- CUDA_CUBLAS_LIBRARIES -- Device or emulation library for the Cuda BLAS
|
|
|
- implementation (alternative to:
|
|
|
- CUDA_ADD_CUBLAS_TO_TARGET macro).
|
|
|
- CUDA_cudart_static_LIBRARY -- Statically linkable cuda runtime library.
|
|
|
- Only available for CUDA version 5.5+
|
|
|
- CUDA_cudadevrt_LIBRARY -- Device runtime library.
|
|
|
- Required for separable compilation.
|
|
|
- CUDA_cupti_LIBRARY -- CUDA Profiling Tools Interface library.
|
|
|
- Only available for CUDA version 4.0+.
|
|
|
- CUDA_curand_LIBRARY -- CUDA Random Number Generation library.
|
|
|
- Only available for CUDA version 3.2+.
|
|
|
- CUDA_cusolver_LIBRARY -- CUDA Direct Solver library.
|
|
|
- Only available for CUDA version 7.0+.
|
|
|
- CUDA_cusparse_LIBRARY -- CUDA Sparse Matrix library.
|
|
|
- Only available for CUDA version 3.2+.
|
|
|
- CUDA_npp_LIBRARY -- NVIDIA Performance Primitives lib.
|
|
|
- Only available for CUDA version 4.0+.
|
|
|
- CUDA_nppc_LIBRARY -- NVIDIA Performance Primitives lib (core).
|
|
|
- Only available for CUDA version 5.5+.
|
|
|
- CUDA_nppi_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 5.5 - 8.0.
|
|
|
- CUDA_nppial_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0.
|
|
|
- CUDA_nppicc_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0.
|
|
|
- CUDA_nppicom_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0 - 10.2.
|
|
|
- Replaced by nvjpeg.
|
|
|
- CUDA_nppidei_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0.
|
|
|
- CUDA_nppif_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0.
|
|
|
- CUDA_nppig_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0.
|
|
|
- CUDA_nppim_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0.
|
|
|
- CUDA_nppist_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0.
|
|
|
- CUDA_nppisu_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0.
|
|
|
- CUDA_nppitc_LIBRARY -- NVIDIA Performance Primitives lib (image processing).
|
|
|
- Only available for CUDA version 9.0.
|
|
|
- CUDA_npps_LIBRARY -- NVIDIA Performance Primitives lib (signal processing).
|
|
|
- Only available for CUDA version 5.5+.
|
|
|
- CUDA_nvcuvenc_LIBRARY -- CUDA Video Encoder library.
|
|
|
- Only available for CUDA version 3.2+.
|
|
|
- Windows only.
|
|
|
- CUDA_nvcuvid_LIBRARY -- CUDA Video Decoder library.
|
|
|
- Only available for CUDA version 3.2+.
|
|
|
- Windows only.
|
|
|
- CUDA_nvToolsExt_LIBRARY
|
|
|
- -- NVIDA CUDA Tools Extension library.
|
|
|
- Available for CUDA version 5+.
|
|
|
- CUDA_OpenCL_LIBRARY -- NVIDA CUDA OpenCL library.
|
|
|
- Available for CUDA version 5+.
|
|
|
+``CUDA_SEPARABLE_COMPILATION`` (Default: ``OFF``)
|
|
|
+ If set this will enable separable compilation for all CUDA runtime object
|
|
|
+ files. If used outside of ``cuda_add_executable()`` and ``cuda_add_library()``
|
|
|
+ (e.g. calling ``cuda_wrap_srcs()`` directly),
|
|
|
+ ``cuda_compute_separable_compilation_object_file_name()`` and
|
|
|
+ ``cuda_link_separable_compilation_objects()`` should be called.
|
|
|
+
|
|
|
+``CUDA_SOURCE_PROPERTY_FORMAT``
|
|
|
+ .. versionadded:: 3.3
|
|
|
+
|
|
|
+ If this source file property is set, it can override the format specified
|
|
|
+ to ``cuda_wrap_srcs()`` (``OBJ``, ``PTX``, ``CUBIN``, or ``FATBIN``). If an input source file
|
|
|
+ is not a ``.cu`` file, setting this property will cause it to be treated as a ``.cu``
|
|
|
+ file. See documentation for :command:`set_source_files_properties` on how to set
|
|
|
+ this property.
|
|
|
+
|
|
|
+``CUDA_USE_STATIC_CUDA_RUNTIME`` (Default: ``ON``)
|
|
|
+ .. versionadded:: 3.3
|
|
|
+
|
|
|
+ When enabled the static version of the CUDA runtime library will be used
|
|
|
+ in ``CUDA_LIBRARIES``. If the version of CUDA configured doesn't support
|
|
|
+ this option, then it will be silently disabled.
|
|
|
+
|
|
|
+``CUDA_VERBOSE_BUILD`` (Default: ``OFF``)
|
|
|
+ Set to ``ON`` to see all the commands used when building the CUDA file. When
|
|
|
+ using a Makefile generator the value defaults to ``VERBOSE`` (run
|
|
|
+ ``make VERBOSE=1`` to see output), although setting ``CUDA_VERBOSE_BUILD`` to ``ON`` will
|
|
|
+ always print the output.
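The variables above are read when the macros are called, so they are set
beforehand. A sketch, assuming a hypothetical library ``kernels`` built from
``kernels.cu`` and a hypothetical non-``.cu`` source ``helper.cpp``:

.. code-block:: cmake

  find_package(CUDA REQUIRED)

  # Multiple nvcc arguments form a semicolon-delimited list.
  list(APPEND CUDA_NVCC_FLAGS "--compiler-options;-Wall")

  # Do not forward CMAKE_<LANG>_FLAGS to the host compiler.
  set(CUDA_PROPAGATE_HOST_FLAGS OFF)

  # Use the keyword signature of target_link_libraries() inside the module ...
  set(CUDA_LINK_LIBRARIES_KEYWORD PRIVATE)

  # Treat a non-.cu file as CUDA source that is compiled to an object file.
  set_source_files_properties(helper.cpp
    PROPERTIES CUDA_SOURCE_PROPERTY_FORMAT OBJ)

  cuda_add_library(kernels STATIC kernels.cu helper.cpp)
  # ... and the same keyword form outside of it.
  target_link_libraries(kernels PRIVATE ${CUDA_CUBLAS_LIBRARIES})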
|
|
|
+
|
|
|
+Commands
|
|
|
+""""""""
|
|
|
+
|
|
|
+The script creates the following functions and macros (in alphabetical order):
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_add_cufft_to_target(<cuda_target>)
|
|
|
+
|
|
|
+Adds the cufft library to the target (can be any target). Handles whether
|
|
|
+you are in emulation mode or not.
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_add_cublas_to_target(<cuda_target>)
|
|
|
+
|
|
|
+Adds the cublas library to the target (can be any target). Handles
|
|
|
+whether you are in emulation mode or not.
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_add_executable(<cuda_target> <file>...
|
|
|
+ [WIN32] [MACOSX_BUNDLE] [EXCLUDE_FROM_ALL] [OPTIONS ...])
|
|
|
+
|
|
|
+Creates an executable ``<cuda_target>`` which is made up of the files
|
|
|
+specified. All of the non CUDA C files are compiled using the standard
|
|
|
+build rules specified by CMake and the CUDA files are compiled to object
|
|
|
+files using nvcc and the host compiler. In addition ``CUDA_INCLUDE_DIRS`` is
|
|
|
+added automatically to :command:`include_directories`. Some standard CMake target
|
|
|
+calls can be used on the target after calling this macro
|
|
|
+(e.g. :command:`set_target_properties` and :command:`target_link_libraries`), but setting
|
|
|
+properties that adjust compilation flags will not affect code compiled by
|
|
|
+nvcc. Such flags should be modified before calling ``cuda_add_executable()``,
|
|
|
+``cuda_add_library()`` or ``cuda_wrap_srcs()``.
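For example (target and file names are hypothetical; the per-configuration
option syntax is the one described for ``cuda_wrap_srcs()``):

.. code-block:: cmake

  cuda_add_executable(demo main.cpp kernel.cu
    OPTIONS -DFLAG=2
    DEBUG -g
    RELEASE --use_fast_math)

  # Plain signature, matching the default (empty) CUDA_LINK_LIBRARIES_KEYWORD.
  target_link_libraries(demo ${CUDA_CUFFT_LIBRARIES})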
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_add_library(<cuda_target> <file>...
|
|
|
+ [STATIC | SHARED | MODULE] [EXCLUDE_FROM_ALL] [OPTIONS ...])
|
|
|
+
|
|
|
+Same as ``cuda_add_executable()`` except that a library is created.
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_build_clean_target()
|
|
|
+
|
|
|
+Creates a convenience target that deletes all the dependency files
|
|
|
+generated. You should run ``make clean`` after running this target to ensure the
|
|
|
+dependency files get regenerated.
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_compile(<generated_files> <file>... [STATIC | SHARED | MODULE]
|
|
|
+ [OPTIONS ...])
|
|
|
+
|
|
|
+Returns a list of generated files from the input source files to be used
|
|
|
+with :command:`add_library` or :command:`add_executable`.
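A short sketch of this pattern, assuming hypothetical sources ``kernel.cu``
and ``main.cpp``:

.. code-block:: cmake

  # GENERATED_OBJS receives the list of files produced by nvcc.
  cuda_compile(GENERATED_OBJS kernel.cu)

  add_executable(demo main.cpp ${GENERATED_OBJS})
  target_link_libraries(demo ${CUDA_LIBRARIES})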
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_compile_ptx(<generated_files> <file>... [OPTIONS ...])
|
|
|
+
|
|
|
+Returns a list of ``PTX`` files generated from the input source files.
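Because the ``PTX`` files are not linked into a target, something has to
depend on them for them to be built. One possible arrangement (the target
name ``ptx`` and the source file are hypothetical):

.. code-block:: cmake

  cuda_compile_ptx(PTX_FILES kernel.cu)
  # A custom target that is part of the default build and depends on the PTX.
  add_custom_target(ptx ALL DEPENDS ${PTX_FILES})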
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_compile_fatbin(<generated_files> <file>... [OPTIONS ...])
|
|
|
+
|
|
|
+.. versionadded:: 3.1
|
|
|
+
|
|
|
+Returns a list of ``FATBIN`` files generated from the input source files.
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_compile_cubin(<generated_files> <file>... [OPTIONS ...])
|
|
|
+
|
|
|
+.. versionadded:: 3.1
|
|
|
+
|
|
|
+Returns a list of ``CUBIN`` files generated from the input source files.
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_compute_separable_compilation_object_file_name(<output_file_var>
|
|
|
+ <cuda_target>
|
|
|
+ <object_files>)
|
|
|
+
|
|
|
+Compute the name of the intermediate link file used for separable
|
|
|
+compilation. This file name is typically passed into
|
|
|
+``cuda_link_separable_compilation_objects()``. ``<output_file_var>`` is produced
|
|
|
+based on ``<cuda_target>`` and the list of object files that need separable
|
|
|
+compilation as specified by ``<object_files>``. If the ``<object_files>`` list is
|
|
|
+empty, then ``<output_file_var>`` will be empty. This function is called
|
|
|
+automatically for ``cuda_add_library()`` and ``cuda_add_executable()``. Note that
|
|
|
+this is a function and not a macro.
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_include_directories(path0 path1 ...)
|
|
|
+
|
|
|
+Sets the directories that should be passed to nvcc
|
|
|
+(e.g. ``nvcc -Ipath0 -Ipath1 ...``). These paths usually contain other ``.cu``
|
|
|
+files.
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_link_separable_compilation_objects(<output_file_var> <cuda_target>
|
|
|
+ <nvcc_flags> <object_files>)
|
|
|
+
|
|
|
+Generates the link object required by separable compilation from the given
|
|
|
+object files. This is called automatically for ``cuda_add_executable()`` and
|
|
|
+``cuda_add_library()``, but can be called manually when using ``cuda_wrap_srcs()``
|
|
|
+directly. When called from ``cuda_add_library()`` or ``cuda_add_executable()`` the
|
|
|
+``<nvcc_flags>`` passed in are the same as the flags passed in via the ``OPTIONS``
|
|
|
+argument. The only nvcc flag added automatically is the bitness flag as
|
|
|
+specified by ``CUDA_64_BIT_DEVICE_CODE``. Note that this is a function
|
|
|
+instead of a macro.
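+
+The manual case can be sketched roughly as follows. This assumes that
+``cuda_wrap_srcs()`` records the objects needing a device link in
+``<cuda_target>_SEPARABLE_COMPILATION_OBJECTS``, a detail that is not
+documented in this section. The target and file names are hypothetical:
+
+.. code-block:: cmake
+
+  find_package(CUDA REQUIRED)
+  set(CUDA_SEPARABLE_COMPILATION ON)
+
+  # Generate build rules for the (hypothetical) sources a.cu and b.cu.
+  cuda_wrap_srcs(mylib OBJ generated_objs a.cu b.cu STATIC)
+
+  # Assumption: the variable below was filled in by cuda_wrap_srcs().
+  cuda_compute_separable_compilation_object_file_name(
+    link_file mylib "${mylib_SEPARABLE_COMPILATION_OBJECTS}")
+
+  # The target must exist before the link rule is attached to it.
+  add_library(mylib STATIC ${generated_objs} ${link_file})
+  set_target_properties(mylib PROPERTIES LINKER_LANGUAGE CXX)
+
+  # No extra nvcc flags are passed here (empty third argument).
+  cuda_link_separable_compilation_objects(
+    "${link_file}" mylib "" "${mylib_SEPARABLE_COMPILATION_OBJECTS}")
+
+  target_link_libraries(mylib ${CUDA_LIBRARIES} ${CUDA_cudadevrt_LIBRARY})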
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_select_nvcc_arch_flags(<out_variable> [<target_CUDA_architecture> ...])
|
|
|
+
|
|
|
+Selects GPU arch flags for nvcc based on ``target_CUDA_architecture``.
|
|
|
+
|
|
|
+Values for ``target_CUDA_architecture``:
|
|
|
+
|
|
|
+* ``Auto``: detects local machine GPU compute arch at runtime.
|
|
|
+* ``Common`` and ``All``: cover common and entire subsets of architectures.
|
|
|
+* ``<name>``: one of ``Fermi``, ``Kepler``, ``Maxwell``, ``Kepler+Tegra``, ``Kepler+Tesla``, ``Maxwell+Tegra``, ``Pascal``.
|
|
|
+* ``<ver>``, ``<ver>(<ver>)``, ``<ver>+PTX``, where ``<ver>`` is one of
|
|
|
+ ``2.0``, ``2.1``, ``3.0``, ``3.2``, ``3.5``, ``3.7``, ``5.0``, ``5.2``, ``5.3``, ``6.0``, ``6.2``.
|
|
|
+
|
|
|
+Returns the list of flags to be added to ``CUDA_NVCC_FLAGS`` in ``<out_variable>``.
|
|
|
+Additionally, sets ``<out_variable>_readable`` to the resulting numeric list.
|
|
|
+
|
|
|
+Example::
|
|
|
+
|
|
|
+ cuda_select_nvcc_arch_flags(ARCH_FLAGS 3.0 3.5+PTX 5.2(5.0) Maxwell)
|
|
|
+ list(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
|
|
|
+
|
|
|
+More info on CUDA architectures: https://en.wikipedia.org/wiki/CUDA.
|
|
|
+Note that this is a function instead of a macro.
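+
+As a further sketch, the ``Auto`` value can be combined with the
+``<out_variable>_readable`` result to report what was selected. The variable
+name ``ARCH_FLAGS`` is arbitrary:
+
+.. code-block:: cmake
+
+  find_package(CUDA REQUIRED)
+
+  # Ask FindCUDA to detect the architecture of the GPU(s) in this machine.
+  cuda_select_nvcc_arch_flags(ARCH_FLAGS Auto)
+
+  # Report both the raw nvcc flags and the numeric architecture list.
+  message(STATUS "nvcc arch flags: ${ARCH_FLAGS}")
+  message(STATUS "Selected architectures: ${ARCH_FLAGS_readable}")
+
+  list(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})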
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_wrap_srcs(<cuda_target> <format> <generated_files> <file>...
|
|
|
+ [STATIC | SHARED | MODULE] [OPTIONS ...])
|
|
|
+
|
|
|
+This is where all the magic happens. ``cuda_add_executable()``,
|
|
|
+``cuda_add_library()``, ``cuda_compile()``, and ``cuda_compile_ptx()`` all call this
|
|
|
+macro under the hood.
|
|
|
+
|
|
|
+Given the list of files ``<file>...`` this macro generates
|
|
|
+custom commands that generate either PTX or linkable objects (use ``PTX`` or
|
|
|
+``OBJ`` for the ``<format>`` argument to switch). Files that don't end with ``.cu``
|
|
|
+or have the ``HEADER_FILE_ONLY`` property are ignored.
|
|
|
+
|
|
|
+The arguments passed in after ``OPTIONS`` are extra command line options to
|
|
|
+give to nvcc. You can also specify per-configuration options by
|
|
|
+specifying the name of the configuration followed by the options. General
|
|
|
+options must precede configuration-specific options. Not all
|
|
|
+configurations need to be specified; only the ones provided will be used.
|
|
|
+For example:
|
|
|
+
|
|
|
+.. code-block:: cmake
|
|
|
+
|
|
|
+ cuda_add_executable(...
|
|
|
+ OPTIONS -DFLAG=2 "-DFLAG_OTHER=space in flag"
|
|
|
+ DEBUG -g
|
|
|
+ RELEASE --use_fast_math
|
|
|
+ RELWITHDEBINFO --use_fast_math;-g
|
|
|
+ MINSIZEREL --use_fast_math)
|
|
|
+
|
|
|
+For certain configurations (namely VS generating object files with
|
|
|
+``CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE`` set to ``ON``), no generated file will
|
|
|
+be produced for the given cuda file. This is because when you add the
|
|
|
+cuda file to Visual Studio it knows that this file produces an object file
|
|
|
+and will link in the resulting object file automatically.
|
|
|
+
|
|
|
+This script will also generate a separate cmake script that is used at
|
|
|
+build time to invoke nvcc. This is for several reasons:
|
|
|
+
|
|
|
+* nvcc can return negative numbers as return values which confuses
|
|
|
+ Visual Studio into thinking that the command succeeded. The script now
|
|
|
+ checks the error codes and produces errors when there was a problem.
|
|
|
+
|
|
|
+* nvcc has been known to not delete incomplete results when it
|
|
|
+ encounters problems. This confuses build systems into thinking the
|
|
|
+ target was generated when in fact an unusable file exists. The script
|
|
|
+ now deletes the output files if there was an error.
|
|
|
+
|
|
|
+* By putting all the options that affect the build into a file and then
|
|
|
+ making the build rule dependent on that file, the output files will be
|
|
|
+ regenerated when the options change.
|
|
|
+
|
|
|
+This script also looks at optional arguments ``STATIC``, ``SHARED``, or ``MODULE`` to
|
|
|
+determine when to target the object compilation for a shared library.
|
|
|
+:variable:`BUILD_SHARED_LIBS` is ignored in ``cuda_wrap_srcs()``, but it is respected in
|
|
|
+``cuda_add_library()``. On some systems special flags are added for building
|
|
|
+objects intended for shared libraries. A preprocessor macro,
|
|
|
+``<target_name>_EXPORTS`` is defined when a shared library compilation is
|
|
|
+detected.
|
|
|
+
|
|
|
+Flags passed into :command:`add_definitions` with ``-D`` or ``/D`` are passed along to nvcc.
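+
+A rough sketch of calling ``cuda_wrap_srcs()`` directly for a shared library.
+The target and file names are hypothetical, and the explicit
+:command:`add_library` call mirrors what ``cuda_add_library()`` is described
+as doing rather than a prescribed recipe:
+
+.. code-block:: cmake
+
+  find_package(CUDA REQUIRED)
+  include_directories(${CUDA_INCLUDE_DIRS})
+
+  # Build rules for shared-library objects: general options first, then
+  # per-configuration options.
+  cuda_wrap_srcs(mykernels OBJ kernel_objs kernels.cu SHARED
+    OPTIONS -lineinfo
+    DEBUG -g)
+
+  # kernels.cu is listed as well so that generators which attach the build
+  # rule to the source file (see above) still see it.
+  add_library(mykernels SHARED ${kernel_objs} kernels.cu host.cpp)
+  target_link_libraries(mykernels ${CUDA_LIBRARIES})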
|
|
|
+
|
|
|
+Result Variables
|
|
|
+""""""""""""""""
|
|
|
+
|
|
|
+The script defines the following variables:
|
|
|
+
|
|
|
+``CUDA_VERSION_MAJOR``
|
|
|
+ The major version of CUDA as reported by nvcc.
|
|
|
+
|
|
|
+``CUDA_VERSION_MINOR``
|
|
|
+ The minor version.
|
|
|
+
|
|
|
+``CUDA_VERSION``, ``CUDA_VERSION_STRING``
|
|
|
+ Full version in the ``X.Y`` format.
|
|
|
+
|
|
|
+``CUDA_HAS_FP16``
|
|
|
+ .. versionadded:: 3.6
|
|
|
+ Whether a short float (``float16``, ``fp16``) is supported.
|
|
|
+
|
|
|
+``CUDA_TOOLKIT_ROOT_DIR``
|
|
|
+ Path to the CUDA Toolkit (defined if not set).
|
|
|
+
|
|
|
+``CUDA_SDK_ROOT_DIR``
|
|
|
+ Path to the CUDA SDK. Use this to find files in the SDK. This script will
|
|
|
+ not directly support finding specific libraries or headers, as that isn't
|
|
|
+ supported by NVIDIA. If you want to change libraries when the path changes
|
|
|
+ see the ``FindCUDA.cmake`` script for an example of how to clear these
|
|
|
+ variables. There are also examples of how to use the ``CUDA_SDK_ROOT_DIR``
|
|
|
+ to locate headers or libraries, if you so choose (at your own risk).
|
|
|
+
|
|
|
+``CUDA_INCLUDE_DIRS``
|
|
|
+ Include directory for CUDA headers. Added automatically
|
|
|
+ for ``cuda_add_executable()`` and ``cuda_add_library()``.
|
|
|
+
|
|
|
+``CUDA_LIBRARIES``
|
|
|
+ CUDA runtime library.
|
|
|
+
|
|
|
+``CUDA_CUFFT_LIBRARIES``
|
|
|
+ Device or emulation library for the CUDA FFT implementation (alternative to
|
|
|
+ ``cuda_add_cufft_to_target()`` macro).
|
|
|
+
|
|
|
+``CUDA_CUBLAS_LIBRARIES``
|
|
|
+ Device or emulation library for the CUDA BLAS implementation (alternative to
|
|
|
+ ``cuda_add_cublas_to_target()`` macro).
|
|
|
+
|
|
|
+``CUDA_cudart_static_LIBRARY``
|
|
|
+ Statically linkable CUDA runtime library.
|
|
|
+ Only available for CUDA version 5.5+.
|
|
|
+
|
|
|
+``CUDA_cudadevrt_LIBRARY``
|
|
|
+ .. versionadded:: 3.7
|
|
|
+ Device runtime library. Required for separable compilation.
|
|
|
+
|
|
|
+``CUDA_cupti_LIBRARY``
|
|
|
+ CUDA Profiling Tools Interface library.
|
|
|
+ Only available for CUDA version 4.0+.
|
|
|
+
|
|
|
+``CUDA_curand_LIBRARY``
|
|
|
+ CUDA Random Number Generation library.
|
|
|
+ Only available for CUDA version 3.2+.
|
|
|
+
|
|
|
+``CUDA_cusolver_LIBRARY``
|
|
|
+ .. versionadded:: 3.2
|
|
|
+ CUDA Direct Solver library.
|
|
|
+ Only available for CUDA version 7.0+.
|
|
|
+
|
|
|
+``CUDA_cusparse_LIBRARY``
|
|
|
+ CUDA Sparse Matrix library.
|
|
|
+ Only available for CUDA version 3.2+.
|
|
|
+
|
|
|
+``CUDA_npp_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib.
|
|
|
+ Only available for CUDA version 4.0+.
|
|
|
+
|
|
|
+``CUDA_nppc_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (core).
|
|
|
+ Only available for CUDA version 5.5+.
|
|
|
+
|
|
|
+``CUDA_nppi_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 5.5 - 8.0.
|
|
|
+
|
|
|
+``CUDA_nppial_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0.
|
|
|
+
|
|
|
+``CUDA_nppicc_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0.
|
|
|
+
|
|
|
+``CUDA_nppicom_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0 - 10.2.
|
|
|
+ Replaced by nvjpeg.
|
|
|
+
|
|
|
+``CUDA_nppidei_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0.
|
|
|
+
|
|
|
+``CUDA_nppif_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0.
|
|
|
+
|
|
|
+``CUDA_nppig_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0.
|
|
|
+
|
|
|
+``CUDA_nppim_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0.
|
|
|
+
|
|
|
+``CUDA_nppist_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0.
|
|
|
+
|
|
|
+``CUDA_nppisu_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0.
|
|
|
+
|
|
|
+``CUDA_nppitc_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (image processing).
|
|
|
+ Only available for CUDA version 9.0.
|
|
|
+
|
|
|
+``CUDA_npps_LIBRARY``
|
|
|
+ NVIDIA Performance Primitives lib (signal processing).
|
|
|
+ Only available for CUDA version 5.5+.
|
|
|
+
|
|
|
+``CUDA_nvcuvenc_LIBRARY``
|
|
|
+ CUDA Video Encoder library.
|
|
|
+ Only available for CUDA version 3.2+.
|
|
|
+ Windows only.
|
|
|
+
|
|
|
+``CUDA_nvcuvid_LIBRARY``
|
|
|
+ CUDA Video Decoder library.
|
|
|
+ Only available for CUDA version 3.2+.
|
|
|
+ Windows only.
|
|
|
+
|
|
|
+``CUDA_nvToolsExt_LIBRARY``
|
|
|
+ .. versionadded:: 3.16
|
|
|
+ NVIDIA CUDA Tools Extension library.
|
|
|
+ Available for CUDA version 5+.
|
|
|
+
|
|
|
+``CUDA_OpenCL_LIBRARY``
|
|
|
+ .. versionadded:: 3.16
|
|
|
+ NVIDIA CUDA OpenCL library.
|
|
|
+ Available for CUDA version 5+.
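+
+A brief sketch of how these result variables might be used together. The
+target ``solver`` and the source ``main.cu`` are hypothetical. The version
+requirement reflects the 7.0+ note on ``CUDA_cusolver_LIBRARY`` above:
+
+.. code-block:: cmake
+
+  find_package(CUDA 7.0 REQUIRED)
+
+  message(STATUS
+    "Found CUDA ${CUDA_VERSION_STRING} in ${CUDA_TOOLKIT_ROOT_DIR}")
+
+  # CUDA_INCLUDE_DIRS and CUDA_LIBRARIES are added automatically here.
+  cuda_add_executable(solver main.cu)
+
+  # Additional toolkit libraries are linked through their result variables.
+  target_link_libraries(solver
+    ${CUDA_CUBLAS_LIBRARIES}
+    ${CUDA_cusolver_LIBRARY})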
|
|
|
|
|
|
#]=======================================================================]
|
|
|
|