User:Uli/R-Car M3-W Mainline GPU Test (2024)

Results of GPU test programs run on the Renesas R-Car M3-W Salvator-X board.

Contents

  • 1 Creating the test setup
    • 1.1 Building the user space (root) file system
    • 1.2 Running tests
    • 1.3 Building ltrace
  • 2 Failing tests
    • 2.1 gles2test1
    • 2.2 rgx_blit_test
  • 3 Alternative binary blobs
  • 4 Passing tests
    • 4.1 rogue2d_unittest
    • 4.2 rgx_kicksync_test
    • 4.3 pvr_memory_test
    • 4.4 rgx_compute_test
    • 4.5 ocl_unit_test

Creating the test setup

Building the user space (root) file system

The following steps build an ARM64 Debian userland on a Debian-derived host, for use as an NFS root file system:

  1. Run sudo qemu-debootstrap --arch arm64 sid /tmp/nfs_root http://deb.debian.org/debian/.
  2. Edit /tmp/nfs_root/etc/shadow and replace root:*:... with root::....
  3. Edit /tmp/nfs_root/etc/resolv.conf and enter a working nameserver address.
  4. Extract all the nested archives in R-Car_Gen3_Series_Evaluation_Software_Package_for_Linux-20170828.zip, then:
    rsync -av R-Car_Gen3_Series_Evaluation_Software_Package_for_Linux/m3_gfxeva_packages/EVARTM0RC7796GLTG0001SL40C_1_3_4/EVARTM0RC7796GLTG0001SL40C/Software/rogue/ /tmp/nfs_root/
  5. Build the GPU driver module:
    cd build/linux/r8a7796
    ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- KERNELDIR=<path to kernel source> make BUILD=release
    Then copy the driver module pvrsrvkm.ko to your userland file system.
  6. Copy the Yocto binary-only libraries to the root file system:
    sudo cp -a ./Yocto223_OCL_20171129/opencl-ddk/12-hour/rogue /tmp/nfs_root/root/yocto
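Step 2 above can also be scripted instead of edited by hand. The following is a minimal sketch, assuming GNU sed; the helper name clear_root_password is hypothetical and not part of any package used here.

```shell
# Hypothetical helper: blank the password field of the root entry in a
# shadow file, so that root can log in without a password.
# Assumes GNU sed (for the -i option).
clear_root_password() {
    # Replace only the second colon-separated field of the "root" line.
    sed -i 's/^root:[^:]*:/root::/' "$1"
}

# Example (run as root on the build host):
# clear_root_password /tmp/nfs_root/etc/shadow
```

Only the root line is touched; all other accounts keep their locked password fields.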

Boot the NFS root file system and continue on the target platform:

  1. Run apt-get update.
  2. Install a number of packages:
    apt-get install build-essential libdrm-dev openssh-server \
        libgbm-dev git autotools-dev autoconf automake \
        pkg-config libtool-bin libwayland-bin libwayland-dev make \
        libudev-dev libelf-dev libunwind-dev
  3. If you want to log in remotely via ssh:
    1. Set a root password.
    2. Edit /etc/ssh/sshd_config and change #PermitRootLogin prohibit-password to PermitRootLogin yes.
    3. Run /etc/init.d/ssh restart.
  4. Build wayland-kms:
    git clone https://github.com/renesas-rcar/wayland-kms.git
    cd wayland-kms
    autoreconf -i
    ./configure
    make
    make install
    ldconfig
  5. Create a number of missing symlinks:
    cd /root/yocto/usr/lib
    ln -s libEGL.so libEGL.so.1
    ln -s libGLES_CM.so libGLES_CM.so.1
    ln -s libGLESv2.so libGLESv2.so.1
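The symlink step can be made repeatable with a small helper. This is only a sketch; the function name link_sonames is hypothetical, and it does nothing beyond what the ln commands in step 5 do, except that it skips links that already exist.

```shell
# Hypothetical helper: create "<name>.1" symlinks next to the given libraries.
# Usage: link_sonames <libdir> <library>...
link_sonames() {
    libdir=$1
    shift
    for lib in "$@"; do
        # Skip existing files and (possibly dangling) links so the helper
        # can be re-run without errors.
        [ -e "$libdir/$lib.1" ] || [ -L "$libdir/$lib.1" ] || ln -s "$lib" "$libdir/$lib.1"
    done
}

# Example (on the target):
# link_sonames /root/yocto/usr/lib libEGL.so libGLES_CM.so libGLESv2.so
```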

Running tests

Run the tests provided with the Yocto binary-only libraries like so:

LD_LIBRARY_PATH=/root/yocto/usr/lib /root/yocto/usr/local/bin/<test program>
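To get a quick overview, the invocation above can be wrapped in a loop over all test programs. The sketch below rests on assumptions: the helper name run_gpu_tests is hypothetical, every program is run without arguments, and a non-zero exit status is taken to mean failure. Some tests may need arguments or may block, so treat it as a starting point rather than a complete harness.

```shell
# Hypothetical helper: run every executable in <bindir> with <libdir> on the
# library search path and print one PASS/FAIL line per program.
# Assumption: a non-zero exit status (or death by signal) means failure.
run_gpu_tests() {
    bindir=$1
    libdir=$2
    for prog in "$bindir"/*; do
        # Only consider regular, executable files.
        [ -f "$prog" ] && [ -x "$prog" ] || continue
        if LD_LIBRARY_PATH=$libdir "$prog" >/dev/null 2>&1; then
            echo "$(basename "$prog"): PASS"
        else
            echo "$(basename "$prog"): FAIL (exit $?)"
        fi
    done
}

# Example (on the target):
# run_gpu_tests /root/yocto/usr/local/bin /root/yocto/usr/lib
```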

Building ltrace

ltrace is, surprisingly, not a standard Debian package. To build it locally, run:

git clone https://github.com/dkogan/ltrace.git
cd ltrace
./autogen.sh
./configure --disable-werror
make
make install

Failing tests

In general, all tests using EGL fail early on: eglInitialize() fails, and eglGetError() then reports 0x3001 (EGL_NOT_INITIALIZED).

The reason for this is unclear, as there is no documented facility that would allow introspection into the workings of the binary-only libraries shipped as part of the Yocto GPU support package linked above.

Given that all tests that do not make use of either EGL or libsutu_display.so pass, I am inclined to think it is a user space incompatibility between the Debian user space and the Yocto libraries, rather than a driver problem.

(Not all failing tests are documented here as the failure mode is identical in all cases.)

gles2test1

--------------------- started ---------------------
'eglInitialize' returned egl error 'EGL_NOT_INITIALIZED' (0x3001)

ltrace log of the test run:

[pid 747] __libc_start_main([ "/root/yocto/usr/local/bin/gles2t"... ] <unfinished ...>
[pid 747] puts("--------------------- started --"...) = 52
[pid 747] eglGetDisplay(0, 0, 1, 0) = 1
[pid 747] eglInitialize(1, 0xffffdddf1574, 0xffffdddf1578, 0) = 0
[pid 747] eglGetError(0xaaaab25450a0, 0, 0xffffbb4afbd0, 0) = 0x3001
[pid 747] printf("'%s' returned egl error '%s' (0x"..., "eglInitialize", "EGL_NOT_INITIALIZED", 0x3001) = 66
[pid 747] exit(1 <unfinished ...>
[pid 747] __cxa_finalize(0xaaaab2555828, 0xaaaab2543f60, 0xaaaab2542000, 1) = 0
--------------------- started ---------------------
'eglInitialize' returned egl error 'EGL_NOT_INITIALIZED' (0x3001)
[pid 747] +++ exited (status 1) +++

It has been verified via ldd that the correct (i.e. binary-only) libraries are being used in the test run.
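A check along these lines can be reproduced with ldd and the same LD_LIBRARY_PATH used for the test run. The sketch below is an assumption about how such a check might look, not a record of the commands actually used; the helper name resolved_libs is hypothetical.

```shell
# Hypothetical helper: print the ldd lines of <program> that match <pattern>,
# resolving libraries with <libdir> on the search path.
# Usage: resolved_libs <program> <libdir> <pattern>
resolved_libs() {
    LD_LIBRARY_PATH=$2 ldd "$1" | grep -E "$3"
}

# Example (on the target); every match should point into /root/yocto/usr/lib:
# resolved_libs /root/yocto/usr/local/bin/gles2test1 /root/yocto/usr/lib 'EGL|GLES'
#
# A trace like the one above could be captured with something like the
# following; the -f option is an assumption based on the "[pid N]" prefixes:
# LD_LIBRARY_PATH=/root/yocto/usr/lib ltrace -f /root/yocto/usr/local/bin/gles2test1
```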

rgx_blit_test

This is an example of a failing test using libsutu_display.so.

------------------- rgx blit test -------------------
---------------------- Start ------------------------
Using display type: DRM/KMS
Call PVRSRVConnect with a valid argument: OK
Attempt to create device memory context: OK
Creating synchronization context: OK
(unittests/services/common/sutu_display/sutu_drm.c:1213) Invalid pointer (psDisplay->psOutput == (nil))

ltrace log:

[pid 741] __libc_start_main([ "/root/yocto/usr/local/bin/rgx_bl"... ] <unfinished ...>
[pid 741] puts("------------------- rgx blit tes"...) = 54
[pid 741] puts("---------------------- Start ---"...) = 54
[pid 741] sutu_DisplayGetTypeName(54, 0, 0x8b160887277a7100, 0) = 0xffffa604bb68
[pid 741] printf("Using display type:%s\n", "DRM/KMS") = 28
[pid 741] PVRSRVConnect(0xfffffe88c650, 0, 0x8b160887277a7100, 0xffffa6256b20) = 0
[pid 741] puts(" OK") = 4
[pid 741] PVRSRVCreateDeviceMemContext(0xaaab0ace4020, 0xfffffe88c658, 0x8b160887277a7100, 0xffffa6256b20) = 0
[pid 741] puts(" OK") = 4
[pid 741] PVRSRVSyncPrimContextCreate(0xaaab0ace4020, 0xfffffe88c670, 0x8b160887277a7100, 0xffffa6256b20) = 0
[pid 741] puts(" OK") = 4
[pid 741] sutu_DisplayCreateContext(0xaaab0ace4020, 0xaaab0ace4180, 0xfffffe88c620, 0 <no return ...>
[pid 741] --- SIGABRT (Aborted) ---
[pid 741] +++ killed by SIGABRT +++

Alternative binary blobs

The firmware required for the particular GPU version in the M3-W SoC (rgx.fw.4.45.2.58) is included in the file R-Car_Gen3_Series_Evaluation_Software_Package_for_Linux-20170828.zip, which also contains the original driver sources and a different set of binary-blob libraries. One would be inclined to assume that these components are more compatible with one another, but any tests run with them fail in the same way as with the Yocto libraries.

Passing tests

rogue2d_unittest

Test succeeded
Writing dst.bin size 65536
end of rogue2d unit test

rgx_kicksync_test

rgx_kicksync_test test configuration:
Verbose: N
Num contexts: 2
Num syncs per context: 16
Num loops: 1
Sync first value: 0x00000001
Num syncs per command: 3
Num commands per loop: 32
Verify every command: N
Delay (ms): 0
Random: N
----------------------- Start -----------------------
----------------------- Loop 1 / 1 -----------------------
Initialising contexts
Submitting 32 commands, each with 3 syncs
All commands submitted
Verifying all syncs have their expected final value
All syncs have their expected value
Releasing contexts
Test successful
----------------------- End -----------------------

pvr_memory_test

Profiling 4MB block-size transfer (512 times, 2048MB) using 4K-aligned (12) buffer
-------------------- Start tests (+ Memcpy) --------------------
Loop count = 512
CPU(ZeroPg) -> CPU(ZeroPg) started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 1.168463394 seconds = 1752.7Mbytes/sec
CPU(ZeroPg) -> CPU(Cached) started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 1.161539246 seconds = 1763.2Mbytes/sec
CPU(Cached) -> CPU(Cached) started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 2.44308258 seconds = 1001.8Mbytes/sec
CPU(Cached) -> DEV(Cached) started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 2.48023952 seconds = 1000.0Mbytes/sec
CPU(Cached) -> DEV(Cached) + Flush started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 2.671264205 seconds = 766.7Mbytes/sec
DEV(Cached) -> CPU(Cached) started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 2.34345457 seconds = 1006.7Mbytes/sec
CPU(Cached) -> DEV(Uncached) started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 3.235934264 seconds = 632.9Mbytes/sec
DEV(Uncached) -> CPU(Cached) started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 16.159380557 seconds = 126.7Mbytes/sec
CPU(Cached) -> DEV(Write-Combined) started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 1.838601274 seconds = 1113.9Mbytes/sec
DEV(Write-Combined) -> CPU(Cached) started: Moving 512 blocks of 4194304 bytes = 2048Mbytes per test
took 10.265888423 seconds = 199.5Mbytes/sec
--------------------- End tests ---------------------
Profiling 4MB block-size transfer (16 times, 64MB) using 4K-aligned (12) buffer
-------------------- Start tests (+ Memcpy) --------------------
Loop count = 16
CPU(ZeroPg) -> CPU(ZeroPg) started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.43178572 seconds = 1482.2Mbytes/sec
CPU(ZeroPg) -> CPU(Cached) started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.36431785 seconds = 1756.7Mbytes/sec
CPU(Cached) -> CPU(Cached) started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.64104496 seconds = 998.4Mbytes/sec
CPU(Cached) -> DEV(Cached) started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.64066577 seconds = 999.0Mbytes/sec
CPU(Cached) -> DEV(Cached) + Flush started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.83345853 seconds = 767.9Mbytes/sec
DEV(Cached) -> CPU(Cached) started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.65629823 seconds = 975.2Mbytes/sec
CPU(Cached) -> DEV(Uncached) started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.104447218 seconds = 612.7Mbytes/sec
DEV(Uncached) -> CPU(Cached) started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.506740307 seconds = 126.3Mbytes/sec
CPU(Cached) -> DEV(Write-Combined) started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.57340429 seconds = 1116.1Mbytes/sec
DEV(Write-Combined) -> CPU(Cached) started: Moving 16 blocks of 4194304 bytes = 64Mbytes per test
took 0.321706407 seconds = 198.9Mbytes/sec
--------------------- End tests ---------------------
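Some of the "took" values in this log only agree with the reported throughput if the digits after the decimal point are read as a nanosecond count printed without zero padding. For example, 2048 Mbytes in "2.44308258" seconds yields 1001.8 Mbytes/sec only if the duration is really 2.044308258 s. This is an inference from the numbers, not documented behaviour of the test. The sketch below recomputes the throughput under that assumption; the helper name mbytes_per_sec is hypothetical.

```shell
# Hypothetical helper: recompute throughput from a "took" value.
# Assumption: the part after the decimal point is a nanosecond count that
# was printed without leading zeros (so ".44308258" means 44308258 ns).
# Usage: mbytes_per_sec <total Mbytes> <took value>
mbytes_per_sec() {
    awk -v mb="$1" -v took="$2" 'BEGIN {
        # Split "<seconds>.<nanoseconds>" on the literal dot.
        split(took, t, "[.]")
        secs = t[1] + t[2] / 1000000000
        printf "%.1f\n", mb / secs
    }'
}

mbytes_per_sec 2048 1.168463394   # prints 1752.7
mbytes_per_sec 2048 2.44308258    # prints 1001.8
mbytes_per_sec 64 0.43178572      # prints 1482.2
```

All three results match the Mbytes/sec figures in the log, which supports the unpadded-nanosecond reading.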

rgx_compute_test

------------------ RGX compute test -----------------
----------------------- Start -----------------------
Call PVRSRVConnect with a valid argument: OK
Attempt to create device memory context: OK
Creating synchronization context: OK
Looking up General heap handle OK
Getting event object OK
Creating Compute Context OK
Creating Buffer
Creating DWord for CDM Event Object OK OK
Create PDS Heap OK
Create USC Heap OK
Allocate sync primitive OK
Creating NOP instruction
Creating Data Segment
Creating Code Segment
Write Kernel 0
Creating Fence Data Segment
Creating Code Segment
Write Fence Kernel
Write Terminate
Call services to kick CDM OK
Poll for sync update OK
Poll for CDM event object data OK
Destroy Compute Context OK
Total time: 0ms
Destroy synchronization context:
Destroy Device Memory Context
Disconnect from services: OK
------------------------ End ------------------------

ocl_unit_test

OpenCL Unit Test(s) (Rogue_DDK_Linux rogueddk 1.7@4563938,release,r8a7795_linux) at Fri Oct 6 08:40:47 2017
00
******************************************************************************
Platform Test:
Checks that an OpenCL compatible platform is present
for the unit test to run.
********************************************************************************
Verify_Platform: Enumerating 1 platforms
Verify_Platform: CL_PLATFORM_PROFILE EMBEDDED_PROFILE
Verify_Platform: CL_PLATFORM_VERSION OpenCL 1.2
Verify_Platform: CL_PLATFORM_NAME PowerVR Rogue
Verify_Platform: CL_PLATFORM_VENDOR Imagination Technologies
Verify_Platform: CL_PLATFORM_EXTENSIONS 'cl_khr_icd cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_egl_image cl_img_yuv_image cles_khr_int64 cl_img_cached_allocations cl_img_use_gralloc_ptr cl_khr_spir'
platform --> passed
platform: Test took 0.22 seconds to run:
platform: Verify 0.22s (100.00)%
01
******************************************************************************
Device Test:
Checks that an OpenCL compatible device is present
for the unit test to run.
********************************************************************************
Verify_Device: Enumerating 1 devices
Verify_Device: CL_DEVICE_TYPE CL_DEVICE_TYPE_GPU
Verify_Device: CL_DEVICE_NAME PowerVR Rogue GX6650
Verify_Device: CL_DEVICE_VENDORI Imagination Technologies
Verify_Device: CL_DRIVER_VERSION 1.7@4563938
Verify_Device: CL_DEVICE_PROFILE EMBEDDED_PROFILE
Verify_Device: CL_DEVICE_VERSION OpenCL 1.2
Verify_Device: CL_DEVICE_EXTENSIONS cl_khr_icd cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_egl_image cl_img_yuv_image cles_khr_int64 cl_img_cached_allocations cl_img_use_gralloc_ptr cl_khr_spir
device --> passed
device: Test took 0.34 seconds to run:
device: Verify 0.34s (100.00)%
02
******************************************************************************
Bounding Box:
Runs the kernel that compute floating point minimum and maximum
for large number of vertices.
********************************************************************************
Objects: 100, # triangles per object: 15393, Computing instances per object: 512, Time - start - stop: 164.080000ms
bbox --> passed
bbox: Test took 2.87 seconds to run:
bbox: Init 2.56s (89.19%) Verify 0.31s (10.81)%
03
******************************************************************************
Addition Kernel:
Performs an online compilation of an integer addition kernel
and verifies that the output buffer is correct.
********************************************************************************
Compute_Add: Online compilation test with 4096 instances running source:
<source>
__kernel void AdditionKernel(__global int* a, __global int* b)
{
int ith = get_global_id(0);
a[ith] = a[ith] + b[ith];
}
</source>
Verify_Add: Verification OK
add --> passed
add: Test took 0.09 seconds to run:
add: Init 0.02s (21.98%) Compute 0.02s (23.08%)
Verify 0.05s (54.95)%
04
******************************************************************************
Binary Check:
Runs the same kernel as the addition test however first saves
the binary version to the filsystem, recreates the OpenCL
context and ensures the binary test computes the same results
as the online test.
********************************************************************************
CL_PROGRAM_BUILD_LOG:
Init_Binary: Wrote out 1 binaries to the file system
Compute_Binary: Successfully loaded back binary_0.bin from filesystem
Verify_Binary: Binary file successfully executed kernel
binary --> passed
binary: Test took 0.18 seconds to run:
binary: Init 0.09s (46.70%) Compute 0.04s (19.23%)
Verify 0.06s (34.07)%
05
******************************************************************************
Error Log Check:
Runs an illegal kernel that contains a simple undeclared
identifier error and verifies that the build log provided
by the OpenCL implementation is correct.
********************************************************************************
errorlog:*** Build Log ***
BuildGroup_1:3:2: error: use of undeclared identifier 'undeclared'
 undeclared identifier
 ^
errorlog: Verified build log contained required error.
errorlog --> passed
errorlog: Test took 0.09 seconds to run:
errorlog: Init 0.01s (10.00%) Verify 0.08s (90.00)%
06
******************************************************************************
Memory Copy Kernel:
Performs an online compilation of a kernel which copies input
to output, verifying the results and calculating the speed at
which the data is transferred.
********************************************************************************
BufferType: uchar, Instances: 524288, Copies Per Instance: 32, Copied: 16 MBs, Time: 0.068670s, 232.998398MB/s
BufferType: ushort, Instances: 262144, Copies Per Instance: 32, Copied: 16 MBs, Time: 0.128205s, 124.800125MB/s
BufferType: uint, Instances: 131072, Copies Per Instance: 32, Copied: 16 MBs, Time: 0.026237s, 609.825819MB/s
BufferType: uint2, Instances: 65536, Copies Per Instance: 32, Copied: 16 MBs, Time: 0.092728s, 172.547666MB/s
BufferType: uint4, Instances: 32768, Copies Per Instance: 32, Copied: 16 MBs, Time: 0.023522s, 680.214267MB/s
BufferType: uint8, Instances: 16384, Copies Per Instance: 32, Copied: 16 MBs, Time: 0.025302s, 632.361078MB/s
BufferType: uint16, Instances: 8192, Copies Per Instance: 32, Copied: 16 MBs, Time: 0.015503s, 1032.058311MB/s
memcpy --> passed
memcpy: Test took 1.90 seconds to run:
memcpy: Init 0.08s (3.95%) Verify 1.82s (96.05)%
07
******************************************************************************
Memory Strided Copy Kernel:
Performs an online compilation of a kernel which copies input
to output, verifying the results and calculating the speed at
which the data is transferred using strided pattern.
********************************************************************************
BufferType: uchar, Instances: 524288, Copies Per Instance: 32, Copied: 16 MBs, Workgroup Size: 256, Time: 0.057483s, 278.343162MB/s
BufferType: ushort, Instances: 262144, Copies Per Instance: 32, Copied: 16 MBs, Workgroup Size: 128, Time: 0.032100s, 498.442368MB/s
BufferType: uint, Instances: 131072, Copies Per Instance: 32, Copied: 16 MBs, Workgroup Size: 64, Time: 0.013142s, 1217.470705MB/s
BufferType: uint2, Instances: 65536, Copies Per Instance: 32, Copied: 16 MBs, Workgroup Size: 32, Time: 0.013185s, 1213.500190MB/s
BufferType: uint4, Instances: 32768, Copies Per Instance: 32, Copied: 16 MBs, Workgroup Size: 16, Time: 0.012212s, 1310.186702MB/s
BufferType: uint8, Instances: 16384, Copies Per Instance: 32, Copied: 16 MBs, Workgroup Size: 8, Time: 0.012611s, 1268.733645MB/s
BufferType: uint16, Instances: 8192, Copies Per Instance: 32, Copied: 16 MBs, Workgroup Size: 4, Time: 0.012611s, 1268.733645MB/s
memcpy_stride --> passed
memcpy_stride: Test took 1.42 seconds to run:
memcpy_stride: Init 0.06s (4.03%) Verify 1.36s (95.97)%
08
******************************************************************************
Memory Read Kernel:
Performs an online compilation of a kernel which reads large
amounts of data each instance, calculating read bandwidth.
********************************************************************************
Performing 32 memory reads per kernel instance, totalling 16777216 memory loads...
BufferType: uchar, Instances: 524288, Reads per instance: 128, Read: 64 MBs, Time: 0.157612s, 406.060452MB/s
Performing 32 memory reads per kernel instance, totalling 8388608 memory loads...
BufferType: ushort, Instances: 262144, Reads per instance: 128, Read: 64 MBs, Time: 0.635605s, 100.691467MB/s
Performing 32 memory reads per kernel instance, totalling 4194304 memory loads...
BufferType: uint, Instances: 131072, Reads per instance: 128, Read: 64 MBs, Time: 0.039797s, 1608.161419MB/s
Performing 32 memory reads per kernel instance, totalling 2097152 memory loads...
BufferType: uint2, Instances: 65536, Reads per instance: 128, Read: 64 MBs, Time: 0.037831s, 1691.734292MB/s
Performing 32 memory reads per kernel instance, totalling 1048576 memory loads...
BufferType: uint4, Instances: 32768, Reads per instance: 128, Read: 64 MBs, Time: 0.037738s, 1695.903334MB/s
Performing 32 memory reads per kernel instance, totalling 524288 memory loads...
BufferType: uint8, Instances: 16384, Reads per instance: 128, Read: 64 MBs, Time: 0.038040s, 1682.439537MB/s
Performing 32 memory reads per kernel instance, totalling 262144 memory loads...
BufferType: uint16, Instances: 8192, Reads per instance: 128, Read: 64 MBs, Time: 0.020294s, 3153.641470MB/s
memread --> passed
memread: Test took 2.52 seconds to run:
memread: Init 0.12s (4.64%) Verify 2.40s (95.36)%
09
******************************************************************************
Memory Strided Read Kernel:
Performs an online compilation of a kernel which reads large
amounts of data each instance using stride pattern, calculating read bandwidth.
********************************************************************************
Performing 32 memory reads per kernel instance, totalling 16777216 memory loads...
BufferType: uchar, Instances: 524288, Reads per instance: 128, WG size 256, Read: 64 MBs, Time: 0.077018s, 830.974577MB/s
Performing 32 memory reads per kernel instance, totalling 8388608 memory loads...
BufferType: ushort, Instances: 262144, Reads per instance: 128, WG size 128, Read: 64 MBs, Time: 0.046774s, 1368.281524MB/s
Performing 32 memory reads per kernel instance, totalling 4194304 memory loads...
BufferType: uint, Instances: 131072, Reads per instance: 128, WG size 64, Read: 64 MBs, Time: 0.020116s, 3181.547027MB/s
Performing 32 memory reads per kernel instance, totalling 2097152 memory loads...
BufferType: uint2, Instances: 65536, Reads per instance: 128, WG size 32, Read: 64 MBs, Time: 0.019099s, 3350.960783MB/s
Performing 32 memory reads per kernel instance, totalling 1048576 memory loads...
BufferType: uint4, Instances: 32768, Reads per instance: 128, WG size 16, Read: 64 MBs, Time: 0.019511s, 3280.200912MB/s
Performing 32 memory reads per kernel instance, totalling 524288 memory loads...
BufferType: uint8, Instances: 16384, Reads per instance: 128, WG size 8, Read: 64 MBs, Time: 0.019315s, 3313.486927MB/s
Performing 32 memory reads per kernel instance, totalling 262144 memory loads...
BufferType: uint16, Instances: 8192, Reads per instance: 128, WG size 4, Read: 64 MBs, Time: 0.018088s, 3538.257408MB/s
memread_stride --> passed
memread_stride: Test took 1.39 seconds to run:
memread_stride: Init 0.12s (8.59%) Verify 1.27s (91.41)%
10
******************************************************************************
Memory Write Kernel:
Performs an online compilation of a kernel which writes large
amounts of data each instance, calculating write bandwidth.
********************************************************************************
BufferType: uchar, Instances: 524288, Writes Per Instance: 128, Written: 64 MBs, Time: 0.278133s, 230.105741MB/s
BufferType: ushort, Instances: 262144, Writes Per Instance: 128, Written: 64 MBs, Time: 0.994779s, 64.335898MB/s
BufferType: uint, Instances: 131072, Writes Per Instance: 128, Written: 64 MBs, Time: 0.749035s, 85.443270MB/s
BufferType: uint2, Instances: 65536, Writes Per Instance: 128, Written: 64 MBs, Time: 0.272591s, 234.783980MB/s
BufferType: uint4, Instances: 32768, Writes Per Instance: 128, Written: 64 MBs, Time: 0.146211s, 437.723564MB/s
BufferType: uint8, Instances: 16384, Writes Per Instance: 128, Written: 64 MBs, Time: 0.080468s, 795.347219MB/s
BufferType: uint16, Instances: 8192, Writes Per Instance: 128, Written: 64 MBs, Time: 0.045449s, 1408.171797MB/s
memwrite --> passed
memwrite: Test took 10.17 seconds to run:
memwrite: Init 0.64s (6.25%) Verify 9.53s (93.75)%
11
******************************************************************************
Memory Strided Write Kernel:
Performs an online compilation of a kernel which writes large
amounts of data each instance using stride pattern, calculating write bandwidth.
********************************************************************************
BufferType: uchar, Instances: 524288, Writes Per Instance: 128, Written: 64 MBs, Time: 0.126440s, 506.168934MB/s
BufferType: ushort, Instances: 262144, Writes Per Instance: 128, Written: 64 MBs, Time: 0.074213s, 862.382601MB/s
BufferType: uint, Instances: 131072, Writes Per Instance: 128, Written: 64 MBs, Time: 0.048326s, 1324.338865MB/s
BufferType: uint2, Instances: 65536, Writes Per Instance: 128, Written: 64 MBs, Time: 0.030605s, 2091.161575MB/s
BufferType: uint4, Instances: 32768, Writes Per Instance: 128, Written: 64 MBs, Time: 0.031541s, 2029.104974MB/s
BufferType: uint8, Instances: 16384, Writes Per Instance: 128, Written: 64 MBs, Time: 0.030891s, 2071.800848MB/s
BufferType: uint16, Instances: 8192, Writes Per Instance: 128, Written: 64 MBs, Time: 0.032031s, 1998.064375MB/s
memwrite_stride --> passed
memwrite_stride: Test took 8.61 seconds to run:
memwrite_stride: Init 0.95s (10.99%) Verify 7.67s (89.01)%
12
******************************************************************************
Image Copy Kernel:
Performs an image direct copy.
********************************************************************************
Init_ImgCopyKernel: Hardware detected, verification will be run.
ImgCpy_RunImgCpy: Instances: 128x512, Iterations: 8, Copied 2.00 MBs, Time 0.006000s, 6 Ticks, 333.333333MB/s
ImgCpy_RunImgCpy: Instances: 128x512, Iterations: 16, Copied 4.00 MBs, Time 0.007000s, 7 Ticks, 571.428571MB/s
ImgCpy_RunImgCpy: Instances: 128x512, Iterations: 32, Copied 8.00 MBs, Time 0.014000s, 14 Ticks, 571.428571MB/s
ImgCpy_RunImgCpy: Instances: 128x512, Iterations: 64, Copied 16.00 MBs, Time 0.027000s, 27 Ticks, 592.592593MB/s
ImgCpy_RunImgCpy: Instances: 128x512, Iterations: 128, Copied 32.00 MBs, Time 0.051000s, 51 Ticks, 627.450980MB/s
ImgCpy_RunImgCpy: Instances: 128x512, Iterations: 256, Copied 64.00 MBs, Time 0.102000s, 102 Ticks, 627.450980MB/s
ImgCpy_RunImgCpy: Instances: 128x512, Iterations: 512, Copied 128.00 MBs, Time 0.203000s, 203 Ticks, 630.541872MB/s
ImgCpy_RunImgCpy: Instances: 128x512, Iterations: 1024, Copied 256.00 MBs, Time 0.406000s, 406 Ticks, 630.541872MB/s
ImgCpy_RunImgCpy: Instances: 512x512, Iterations: 8, Copied 2.00 MBs, Time 0.008000s, 8 Ticks, 250.000000MB/s
ImgCpy_RunImgCpy: Instances: 512x512, Iterations: 16, Copied 4.00 MBs, Time 0.011000s, 11 Ticks, 363.636364MB/s
ImgCpy_RunImgCpy: Instances: 512x512, Iterations: 32, Copied 8.00 MBs, Time 0.024000s, 24 Ticks, 333.333333MB/s
ImgCpy_RunImgCpy: Instances: 512x512, Iterations: 64, Copied 16.00 MBs, Time 0.045000s, 45 Ticks, 355.555556MB/s
ImgCpy_RunImgCpy: Instances: 512x512, Iterations: 128, Copied 32.00 MBs, Time 0.090000s, 90 Ticks, 355.555556MB/s
ImgCpy_RunImgCpy: Instances: 512x512, Iterations: 256, Copied 64.00 MBs, Time 0.180000s, 180 Ticks, 355.555556MB/s
ImgCpy_RunImgCpy: Instances: 512x512, Iterations: 512, Copied 128.00 MBs, Time 0.359000s, 359 Ticks, 356.545961MB/s
ImgCpy_RunImgCpy: Instances: 512x512, Iterations: 1024, Copied 256.00 MBs, Time 0.717000s, 717 Ticks, 357.043236MB/s
Verify_ImgCopyKernel: Average speed 453.33 MB/s
imgcpy --> passed
imgcpy: Test took 3.10 seconds to run:
imgcpy: Init 0.16s (5.22%) Verify 2.94s (94.78)%
13
******************************************************************************
Image mipmaps:
Performs an image direct copy with mipmaps.
********************************************************************************
Verify_MipMap: Average speed 0.00 MB/s
mipmap --> passed
mipmap: Test took 0.10 seconds to run:
mipmap: Init 0.01s (10.00%) Verify 0.09s (90.00)%
14
******************************************************************************
Floating Point Operations Kernel (Scalar):
Performs an online compilation of floating point add/mul/mad/div
kernels which each perform a large number of one specific operation,
and calculates the floating point operations per second of the
device. This test uses (scalar) float as its unit of computation.
********************************************************************************
Compiling Add kernel...
Compiling Mul kernel...
Compiling Mad kernel...
Compiling Div kernel...
Compute_Float: Online compilation test with 524288 instances, each with 300 iterations
Running Add kernel...
Running Mul kernel...
Running Mad kernel...
Running Div kernel...
---------------
Float Add:
---------------
Verify_Float: Time 0.410148s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 15417212928
Verify_Float: GFlop 15.417213
Verify_Float: GFLOP/S 37.589390
---------------
Float Mul:
---------------
Verify_Float: Time 0.410114s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 15416164352
Verify_Float: GFlop 15.416164
Verify_Float: GFLOP/S 37.589949
---------------
Float Mad:
---------------
Verify_Float: Time 0.410401s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 30830231552
Verify_Float: GFlop 30.830232
Verify_Float: GFLOP/S 75.122214
---------------
Float Div:
---------------
Verify_Float: Time 0.820315s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 15416164352
Verify_Float: GFlop 15.416164
Verify_Float: GFLOP/S 18.792981
---------------
Verify_Float: All tests time 2.175840s
---------------
float --> passed
float: Test took 5.82 seconds to run:
float: Init 2.16s (37.04%) Compute 2.28s (39.10%)
Verify 1.39s (23.87)%
15
******************************************************************************
Floating Point Operations Kernel (Scalar):
Performs an online compilation of floating point add/mul/mad/div
kernels which each perform a large number of one specific operation,
and calculates the floating point operations per second of the
device. This test uses (scalar) float as its unit of computation.
Kernels run in parallel without synchronization on host side.
********************************************************************************
Compiling Add kernel...
Compiling Mul kernel...
Compiling Mad kernel...
Compiling Div kernel...
Compute_Float_Parallel: Online compilation test with 524288 instances, each with 300 iterations
Running Add kernel...
Running Mul kernel...
Running Mad kernel...
Running Div kernel...
---------------
Float Add:
---------------
Verify_Float: Time 0.410296s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 15417212928
Verify_Float: GFlop 15.417213
Verify_Float: GFLOP/S 37.575830
---------------
Float Mul:
---------------
Verify_Float: Time 0.409929s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 15416164352
Verify_Float: GFlop 15.416164
Verify_Float: GFLOP/S 37.606913
---------------
Float Mad:
---------------
Verify_Float: Time 0.409919s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 30830231552
Verify_Float: GFlop 30.830232
Verify_Float: GFLOP/S 75.210545
---------------
Float Div:
---------------
Verify_Float: Time 0.820163s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 15416164352
Verify_Float: GFlop 15.416164
Verify_Float: GFLOP/S 18.796464
---------------
Verify_Float: All tests time 2.193818s
---------------
float_parallel --> passed
float_parallel: Test took 4.07 seconds to run:
float_parallel: Init 0.22s (5.48%) Compute 2.29s (56.37%)
Verify 1.55s (38.15)%
16
******************************************************************************
Floating Point Operations Kernel (Scalar):
Performs an online compilation of floating point add/mul/mad/div
kernels which each perform a large number of one specific operation,
and calculates the floating point operations per second of the
device. This test uses (scalar) float as its unit of computation.
Kernels run in order of their dependencies created by events.
********************************************************************************
Compiling Add kernel...
Compiling Mul kernel...
Compiling Mad kernel...
Compiling Div kernel...
Compute_Float_Chain: Online compilation test with 524288 instances, each with 300 iterations
Running Add kernel...
Running Mul kernel...
Running Mad kernel...
Running Div kernel...
---------------
Float Add:
---------------
Verify_Float: Time 0.409995s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 15417212928
Verify_Float: GFlop 15.417213
Verify_Float: GFLOP/S 37.603417
---------------
Float Mul:
---------------
Verify_Float: Time 0.409922s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 15416164352
Verify_Float: GFlop 15.416164
Verify_Float: GFLOP/S 37.607555
---------------
Float Mad:
---------------
Verify_Float: Time 0.409930s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 30830231552
Verify_Float: GFlop 30.830232
Verify_Float: GFLOP/S 75.208527
---------------
Float Div:
---------------
Verify_Float: Time 0.820185s
Verify_Float: Iterations 300
Verify_Float: Instances 524288
Verify_Float: Floating point operations 15416164352
Verify_Float: GFlop 15.416164
Verify_Float: GFLOP/S 18.795960
---------------
Verify_Float: All tests time 2.173687s
---------------
float_chain --> passed
float_chain: Test took 4.19 seconds to run:
float_chain: Init 0.14s (3.39%) Compute 2.25s (53.81%)
Verify 1.79s (42.80)%
17
******************************************************************************
Floating Point Operations Kernels for all vector sizes:
Performs an online compilation of floating point add/mul/mad/div
kernels which each perform a large number of one specific operation,
and calculates the floating point operations per second of the
device. This test uses float{1,2,3,4,8,16} as its unit of computation.
********************************************************************************
Compiling Add float kernel...
Compiling Mul float kernel...
Compiling Mad float kernel...
Compiling Div float kernel...
Init_Floatops: Reducing from 65536 to 65536 due to use of float in kernel.
Compiling Add float2 kernel...
Compiling Mul float2 kernel...
Compiling Mad float2 kernel...
Compiling Div float2 kernel...
Init_Floatops: Reducing from 65536 to 65536 due to use of float2 in kernel.
Compiling Add float3 kernel...
Compiling Mul float3 kernel...
Compiling Mad float3 kernel...
Compiling Div float3 kernel...
Init_Floatops: Reducing from 65536 to 65536 due to use of float3 in kernel.
Compiling Add float4 kernel...
Compiling Mul float4 kernel...
Compiling Mad float4 kernel...
Compiling Div float4 kernel...
Init_Floatops: Reducing from 65536 to 65536 due to use of float4 in kernel.
Compiling Add float8 kernel...
Compiling Mul float8 kernel...
Compiling Mad float8 kernel...
Compiling Div float8 kernel...
Init_Floatops: Reducing from 65536 to 65536 due to use of float8 in kernel.
Compiling Add float16 kernel...
Compiling Mul float16 kernel...
Compiling Mad float16 kernel...
Compiling Div float16 kernel...
Init_Floatops: Reducing from 65536 to 65536 due to use of float16 in kernel.
Compute_Floatops: Online compilation test with float using 65536 instances, each with 400 iterations
Running Add kernel for float...
Running Mul kernel for float...
Running Mad kernel for float...
Running Div kernel for float...
Compute_Floatops: Online compilation test with float2 using 65536 instances, each with 400 iterations
Running Add kernel for float2...
Running Mul kernel for float2...
Running Mad kernel for float2...
Running Div kernel for float2...
Compute_Floatops: Online compilation test with float3 using 65536 instances, each with 400 iterations
Running Add kernel for float3...
Running Mul kernel for float3...
Running Mad kernel for float3...
Running Div kernel for float3...
Compute_Floatops: Online compilation test with float4 using 65536 instances, each with 400 iterations
Running Add kernel for float4...
Running Mul kernel for float4...
Running Mad kernel for float4...
Running Div kernel for float4...
Compute_Floatops: Online compilation test with float8 using 65536 instances, each with 400 iterations
Running Add kernel for float8...
Running Mul kernel for float8...
Running Mad kernel for float8...
Running Div kernel for float8...
Compute_Floatops: Online compilation test with float16 using 65536 instances, each with 400 iterations
Running Add kernel for float16...
Running Mul kernel for float16...
Running Mad kernel for float16...
Running Div kernel for float16...
---------------
float Add:
---------------
Verify_Floatops: Time 0.068629s
Verify_Floatops: Iterations 400
Verify_Floatops: Instances 65536
Verify_Floatops: Float operations 2569404416
Verify_Floatops: GFLOP 2.569404
Verify_Floatops: GFLOP/S 37.439048
---------------
float Mul:
---------------
Verify_Floatops: Time 0.068905s
Verify_Floatops: Iterations 400
Verify_Floatops: Instances 65536
Verify_Floatops: Float operations 2569273344
Verify_Floatops: GFLOP 2.569273
Verify_Floatops: GFLOP/S 37.287183
---------------
float Mad:
---------------
Verify_Floatops: Time 0.070338s
Verify_Floatops: Iterations 400
Verify_Floatops: Instances 65536
Verify_Floatops: Float operations 5138284544
Verify_Floatops: GFLOP 5.138285
Verify_Floatops: GFLOP/S 73.051331
---------------
float Div:
---------------
Verify_Floatops: Time 0.136826s
Verify_Floatops: Iterations 400
Verify_Floatops: Instances 65536
Verify_Floatops: Float operations 2569273344
Verify_Floatops: GFLOP 2.569273
Verify_Floatops: GFLOP/S 18.777669
---------------
float2 Add:
---------------
Verify_Floatops: Time 0.135482s
Verify_Floatops: Iterations 400
Verify_Floatops: Instances 65536
Verify_Floatops: Float operations 5138808832
Verify_Floatops: GFLOP 5.138809
Verify_Floatops: GFLOP/S 37.929827
---------------
float2 Mul:
---------------
Verify_Floatops: Time 
0.135580sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 5138546688Verify_Floatops: GFLOP 5.138547Verify_Floatops: GFLOP/S 37.900477---------------float2 Mad:---------------Verify_Floatops: Time 0.135984sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 10276569088Verify_Floatops: GFLOP 10.276569Verify_Floatops: GFLOP/S 75.571899---------------float2 Div:---------------Verify_Floatops: Time 0.272034sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 5138546688Verify_Floatops: GFLOP 5.138547Verify_Floatops: GFLOP/S 18.889355---------------float3 Add:---------------Verify_Floatops: Time 0.210914sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 7708213248Verify_Floatops: GFLOP 7.708213Verify_Floatops: GFLOP/S 36.546712---------------float3 Mul:---------------Verify_Floatops: Time 0.203259sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 7707820032Verify_Floatops: GFLOP 7.707820Verify_Floatops: GFLOP/S 37.921175---------------float3 Mad:---------------Verify_Floatops: Time 0.203220sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 15414853632Verify_Floatops: GFLOP 15.414854Verify_Floatops: GFLOP/S 75.853034---------------float3 Div:---------------Verify_Floatops: Time 0.408912sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 7707820032Verify_Floatops: GFLOP 7.707820Verify_Floatops: GFLOP/S 18.849581---------------float4 Add:---------------Verify_Floatops: Time 0.279024sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 10277617664Verify_Floatops: GFLOP 10.277618Verify_Floatops: GFLOP/S 36.834171---------------float4 Mul:---------------Verify_Floatops: Time 0.270386sVerify_Floatops: Iterations 
400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 10277093376Verify_Floatops: GFLOP 10.277093Verify_Floatops: GFLOP/S 38.008970---------------float4 Mad:---------------Verify_Floatops: Time 0.270518sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 20553138176Verify_Floatops: GFLOP 20.553138Verify_Floatops: GFLOP/S 75.976971---------------float4 Div:---------------Verify_Floatops: Time 0.543566sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 10277093376Verify_Floatops: GFLOP 10.277093Verify_Floatops: GFLOP/S 18.906799---------------float8 Add:---------------Verify_Floatops: Time 0.600531sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 20555235328Verify_Floatops: GFLOP 20.555235Verify_Floatops: GFLOP/S 34.228433---------------float8 Mul:---------------Verify_Floatops: Time 0.538501sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 20554186752Verify_Floatops: GFLOP 20.554187Verify_Floatops: GFLOP/S 38.169264---------------float8 Mad:---------------Verify_Floatops: Time 0.543278sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 41106276352Verify_Floatops: GFLOP 41.106276Verify_Floatops: GFLOP/S 75.663429---------------float8 Div:---------------Verify_Floatops: Time 1.091137sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 20554186752Verify_Floatops: GFLOP 20.554187Verify_Floatops: GFLOP/S 18.837402---------------float16 Add:---------------Verify_Floatops: Time 1.228492sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 41110470656Verify_Floatops: GFLOP 41.110471Verify_Floatops: GFLOP/S 33.464174---------------float16 Mul:---------------Verify_Floatops: Time 1.075268sVerify_Floatops: Iterations 400Verify_Floatops: Instances 
65536Verify_Floatops: Float operations 41108373504Verify_Floatops: GFLOP 41.108374Verify_Floatops: GFLOP/S 38.230816---------------float16 Mad:---------------Verify_Floatops: Time 1.088152sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 82212552704Verify_Floatops: GFLOP 82.212553Verify_Floatops: GFLOP/S 75.552453---------------float16 Div:---------------Verify_Floatops: Time 2.190075sVerify_Floatops: Iterations 400Verify_Floatops: Instances 65536Verify_Floatops: Float operations 41108373504Verify_Floatops: GFLOP 41.108374Verify_Floatops: GFLOP/S 18.770304floatvec --> passedfloatvec: Test took 42.24 seconds to run:floatvec: Init 19.01s (44.99%) Compute 13.70s (32.44%)Verify 9.53s (22.56)%18******************************************************************************Integer Operations Kernels:Performs an online compilation of integer add/mul/mad/divkernels which each perform a large number of one specificoperation, and calculates the operations per second of thedevice. 
This test uses int{1,2,3,4} as its unit of computation.
********************************************************************************
Compiling Add int kernel...
Compiling Mul int kernel...
Compiling Mad int kernel...
Compiling Div int kernel...
Compiling Add int2 kernel...
Compiling Mul int2 kernel...
Compiling Mad int2 kernel...
Compiling Div int2 kernel...
Compiling Add int3 kernel...
Compiling Mul int3 kernel...
Compiling Mad int3 kernel...
Compiling Div int3 kernel...
Compiling Add int4 kernel...
Compiling Mul int4 kernel...
Compiling Mad int4 kernel...
Compiling Div int4 kernel...
Compute_Int: Online compilation test with int using 65536 instances, each with 400 iterations
Running Add kernel for int...
Running Mul kernel for int...
Running Mad kernel for int...
Running Div kernel for int...
Compute_Int: Online compilation test with int2 using 65536 instances, each with 400 iterations
Running Add kernel for int2...
Running Mul kernel for int2...
Running Mad kernel for int2...
Running Div kernel for int2...
Compute_Int: Online compilation test with int3 using 65536 instances, each with 400 iterations
Running Add kernel for int3...
Running Mul kernel for int3...
Running Mad kernel for int3...
Running Div kernel for int3...
Compute_Int: Online compilation test with int4 using 65536 instances, each with 400 iterations
Running Add kernel for int4...
Running Mul kernel for int4...
Running Mad kernel for int4...
Running Div kernel for int4...
---------------
int Add:
---------------
Verify_Int: Time 0.048407s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 996540416
Verify_Int: GIOp 0.996540
Verify_Int: GIOP/S 20.586701
---------------
int Mul:
---------------
Verify_Int: Time 0.044155s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 996409344
Verify_Int: GIOp 0.996409
Verify_Int: GIOP/S 22.566172
---------------
int Mad:
---------------
Verify_Int: Time 0.056313s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 2097414144
Verify_Int: GIOp 2.097414
Verify_Int: GIOP/S 37.245647
---------------
int Div:
---------------
Verify_Int: Time 3.138706s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 996409344
Verify_Int: GIOp 0.996409
Verify_Int: GIOP/S 0.317459
---------------
int2 Add:
---------------
Verify_Int: Time 0.094852s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 1993080832
Verify_Int: GIOp 1.993081
Verify_Int: GIOP/S 21.012534
---------------
int2 Mul:
---------------
Verify_Int: Time 0.086736s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 1992818688
Verify_Int: GIOp 1.992819
Verify_Int: GIOP/S 22.975681
---------------
int2 Mad:
---------------
Verify_Int: Time 0.110893s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 4194828288
Verify_Int: GIOp 4.194828
Verify_Int: GIOP/S 37.827710
---------------
int2 Div:
---------------
Verify_Int: Time 6.437046s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 1992818688
Verify_Int: GIOp 1.992819
Verify_Int: GIOP/S 0.309586
---------------
int3 Add:
---------------
Verify_Int: Time 0.141097s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 2989621248
Verify_Int: GIOp 2.989621
Verify_Int: GIOP/S 21.188411
---------------
int3 Mul:
---------------
Verify_Int: Time 0.128773s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 2989228032
Verify_Int: GIOp 2.989228
Verify_Int: GIOP/S 23.213158
---------------
int3 Mad:
---------------
Verify_Int: Time 0.166180s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 6292242432
Verify_Int: GIOp 6.292242
Verify_Int: GIOP/S 37.864018
---------------
int3 Div:
---------------
Verify_Int: Time 9.671746s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 2989228032
Verify_Int: GIOp 2.989228
Verify_Int: GIOP/S 0.309068
---------------
int4 Add:
---------------
Verify_Int: Time 0.187724s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 3986161664
Verify_Int: GIOp 3.986162
Verify_Int: GIOP/S 21.234161
---------------
int4 Mul:
---------------
Verify_Int: Time 0.171712s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 3985637376
Verify_Int: GIOp 3.985637
Verify_Int: GIOP/S 23.211176
---------------
int4 Mad:
---------------
Verify_Int: Time 0.220862s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 8389656576
Verify_Int: GIOp 8.389657
Verify_Int: GIOP/S 37.985967
---------------
int4 Div:
---------------
Verify_Int: Time 12.905239s
Verify_Int: Iterations 400
Verify_Int: Instances 65536
Verify_Int: Integer operations 3985637376
Verify_Int: GIOp 3.985637
Verify_Int: GIOP/S 0.308839
int --> passed
int: Test took 76.98 seconds to run:
int: Init 37.45s (48.65%) Compute 34.50s (44.81%)
Verify 5.03s (6.53)%
19
******************************************************************************
Integer Operations Kernels:
Performs an online compilation of integer modulo
kernels which each perform a large number of one specific
operation, and calculates the operations per second of the
device.
This test uses int{1,2,3,4} as its unit of computation.
********************************************************************************
Compiling Div uint kernel...
Compiling Div Fast uint kernel...
Compiling Div uint2 kernel...
Compiling Div Fast uint2 kernel...
Compiling Div uint3 kernel...
Compiling Div Fast uint3 kernel...
Compiling Div uint4 kernel...
Compiling Div Fast uint4 kernel...
Compute_IntMod: Online compilation test with uint using 65536 instances, each with 400 iterations
Running Div kernel for uint...
Running Div kernel Fast for uint...
Compute_IntMod: Online compilation test with uint2 using 65536 instances, each with 400 iterations
Running Div kernel for uint2...
Running Div kernel Fast for uint2...
Compute_IntMod: Online compilation test with uint3 using 65536 instances, each with 400 iterations
Running Div kernel for uint3...
Running Div kernel Fast for uint3...
Compute_IntMod: Online compilation test with uint4 using 65536 instances, each with 400 iterations
Running Div kernel for uint4...
Running Div kernel Fast for uint4...
---------------
uint Div:
---------------
Verify_IntMod: Time 0.206417s
Verify_IntMod: Iterations 400
Verify_IntMod: Instances 65536
Verify_IntMod: Integer operations 1520697344
Verify_IntMod: GIOp 1.520697
Verify_IntMod: GIOP/S 7.367113
---------------
uint Div fast:
---------------
Verify_IntMod: Time 0.206099s
Verify_IntMod: Iterations 400
Verify_IntMod: Instances 65536
Verify_IntMod: Integer operations 1520697344
Verify_IntMod: GIOp 1.520697
Verify_IntMod: GIOP/S 7.378480
---------------
uint2 Div:
---------------
Verify_IntMod: Time 0.462457s
Verify_IntMod: Iterations 400
Verify_IntMod: Instances 65536
Verify_IntMod: Integer operations 3041394688
Verify_IntMod: GIOp 3.041395
Verify_IntMod: GIOP/S 6.576600
---------------
uint2 Div fast:
---------------
Verify_IntMod: Time 0.458995s
Verify_IntMod: Iterations 400
Verify_IntMod: Instances 65536
Verify_IntMod: Integer operations 3041394688
Verify_IntMod: GIOp 3.041395
Verify_IntMod: GIOP/S 6.626204
---------------
uint3 Div:
---------------
Verify_IntMod: Time 0.692772s
Verify_IntMod: Iterations 400
Verify_IntMod: Instances 65536
Verify_IntMod: Integer operations 4562092032
Verify_IntMod: GIOp 4.562092
Verify_IntMod: GIOP/S 6.585272
---------------
uint3 Div fast:
---------------
Verify_IntMod: Time 0.690712s
Verify_IntMod: Iterations 400
Verify_IntMod: Instances 65536
Verify_IntMod: Integer operations 4562092032
Verify_IntMod: GIOp 4.562092
Verify_IntMod: GIOP/S 6.604912
---------------
uint4 Div:
---------------
Verify_IntMod: Time 0.938431s
Verify_IntMod: Iterations 400
Verify_IntMod: Instances 65536
Verify_IntMod: Integer operations 6082789376
Verify_IntMod: GIOp 6.082789
Verify_IntMod: GIOP/S 6.481872
---------------
uint4 Div fast:
---------------
Verify_IntMod: Time 0.949093s
Verify_IntMod: Iterations 400
Verify_IntMod: Instances 65536
Verify_IntMod: Integer operations 6082789376
Verify_IntMod: GIOp 6.082789
Verify_IntMod: GIOP/S 6.409055
intmod --> passed
intmod: Test took 24.13 seconds to run:
intmod: Init 16.80s (69.61%) Compute 4.97s (20.58%)
Verify 2.37s (9.81)%
20
******************************************************************************
Short Operations Kernels:
Performs an online compilation of short add/mul/mad/div
kernels which each perform a large number of one specific
operation, and calculates the operations per second of the
device.
This test uses short{1,2,3,4} as its unit of computation.
********************************************************************************
Compiling Add short kernel...
Compiling Mul short kernel...
Compiling Mad short kernel...
Compiling Div short kernel...
Compiling Add short2 kernel...
Compiling Mul short2 kernel...
Compiling Mad short2 kernel...
Compiling Div short2 kernel...
Compiling Add short3 kernel...
Compiling Mul short3 kernel...
Compiling Mad short3 kernel...
Compiling Div short3 kernel...
Compiling Add short4 kernel...
Compiling Mul short4 kernel...
Compiling Mad short4 kernel...
Compiling Div short4 kernel...
Compute_Short: Online compilation test with short using 65536 instances, each with 400 iterations
Running Add kernel for short...
Running Mul kernel for short...
Running Mad kernel for short...
Running Div kernel for short...
Compute_Short: Online compilation test with short2 using 65536 instances, each with 400 iterations
Running Add kernel for short2...
Running Mul kernel for short2...
Running Mad kernel for short2...
Running Div kernel for short2...
Compute_Short: Online compilation test with short3 using 65536 instances, each with 400 iterations
Running Add kernel for short3...
Running Mul kernel for short3...
Running Mad kernel for short3...
Running Div kernel for short3...
Compute_Short: Online compilation test with short4 using 65536 instances, each with 400 iterations
Running Add kernel for short4...
Running Mul kernel for short4...
Running Mad kernel for short4...
Running Div kernel for short4...
---------------
short Add:
---------------
Verify_Short: Time 0.048165s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 996540416
Verify_Short: GIOp 0.996540
Verify_Short: GIOP/S 20.690136
---------------
short Mul:
---------------
Verify_Short: Time 0.044276s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 996409344
Verify_Short: GIOp 0.996409
Verify_Short: GIOP/S 22.504502
---------------
short Mad:
---------------
Verify_Short: Time 0.056554s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 2097414144
Verify_Short: GIOp 2.097414
Verify_Short: GIOP/S 37.086928
---------------
short Div:
---------------
Verify_Short: Time 0.261235s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 996409344
Verify_Short: GIOp 0.996409
Verify_Short: GIOP/S 3.814226
---------------
short2 Add:
---------------
Verify_Short: Time 0.095012s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 1993080832
Verify_Short: GIOp 1.993081
Verify_Short: GIOP/S 20.977148
---------------
short2 Mul:
---------------
Verify_Short: Time 0.086786s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 1992818688
Verify_Short: GIOp 1.992819
Verify_Short: GIOP/S 22.962444
---------------
short2 Mad:
---------------
Verify_Short: Time 0.111290s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 4194828288
Verify_Short: GIOp 4.194828
Verify_Short: GIOP/S 37.692769
---------------
short2 Div:
---------------
Verify_Short: Time 0.418234s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 1992818688
Verify_Short: GIOp 1.992819
Verify_Short: GIOP/S 4.764841
---------------
short3 Add:
---------------
Verify_Short: Time 0.141091s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 2989621248
Verify_Short: GIOp 2.989621
Verify_Short: GIOP/S 21.189312
---------------
short3 Mul:
---------------
Verify_Short: Time 0.129238s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 2989228032
Verify_Short: GIOp 2.989228
Verify_Short: GIOP/S 23.129637
---------------
short3 Mad:
---------------
Verify_Short: Time 0.165615s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 6292242432
Verify_Short: GIOp 6.292242
Verify_Short: GIOP/S 37.993192
---------------
short3 Div:
---------------
Verify_Short: Time 0.626321s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 2989228032
Verify_Short: GIOp 2.989228
Verify_Short: GIOP/S 4.772677
---------------
short4 Add:
---------------
Verify_Short: Time 0.188247s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 3986161664
Verify_Short: GIOp 3.986162
Verify_Short: GIOP/S 21.175167
---------------
short4 Mul:
---------------
Verify_Short: Time 0.171203s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 3985637376
Verify_Short: GIOp 3.985637
Verify_Short: GIOP/S 23.280184
---------------
short4 Mad:
---------------
Verify_Short: Time 0.220319s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 8389656576
Verify_Short: GIOp 8.389657
Verify_Short: GIOP/S 38.079587
---------------
short4 Div:
---------------
Verify_Short: Time 0.834570s
Verify_Short: Iterations 400
Verify_Short: Instances 65536
Verify_Short: Short operations 3985637376
Verify_Short: GIOp 3.985637
Verify_Short: GIOP/S 4.775678
short --> passed
short: Test took 19.14 seconds to run:
short: Init 9.86s (51.52%) Compute 4.28s (22.38%)
Verify 5.00s (26.11)%
21
******************************************************************************
Short Operations Kernels:
Performs an online compilation of short modulo
kernels which each perform a large number of one specific
operation, and calculates the operations per second of the
device.
This test uses short{1,2,3,4} as its unit of computation.
********************************************************************************
Compiling Div ushort kernel...
Compiling Div Fast ushort kernel...
Compiling Div ushort2 kernel...
Compiling Div Fast ushort2 kernel...
Compiling Div ushort3 kernel...
Compiling Div Fast ushort3 kernel...
Compiling Div ushort4 kernel...
Compiling Div Fast ushort4 kernel...
Compute_ShortMod: Online compilation test with ushort using 65536 instances, each with 400 iterations
Running Div kernel for ushort...
Running Div kernel Fast for ushort...
Compute_ShortMod: Online compilation test with ushort2 using 65536 instances, each with 400 iterations
Running Div kernel for ushort2...
Running Div kernel Fast for ushort2...
Compute_ShortMod: Online compilation test with ushort3 using 65536 instances, each with 400 iterations
Running Div kernel for ushort3...
Running Div kernel Fast for ushort3...
Compute_ShortMod: Online compilation test with ushort4 using 65536 instances, each with 400 iterations
Running Div kernel for ushort4...
Running Div kernel Fast for ushort4...
---------------
ushort Div:
---------------
Verify_ShortMod: Time 0.482448s
Verify_ShortMod: Iterations 400
Verify_ShortMod: Instances 65536
Verify_ShortMod: Integer operations 1520697344
Verify_ShortMod: GIOp 1.520697
Verify_ShortMod: GIOP/S 3.152044
---------------
ushort Div fast:
---------------
Verify_ShortMod: Time 0.242209s
Verify_ShortMod: Iterations 400
Verify_ShortMod: Instances 65536
Verify_ShortMod: Integer operations 1520697344
Verify_ShortMod: GIOp 1.520697
Verify_ShortMod: GIOP/S 6.278451
---------------
ushort2 Div:
---------------
Verify_ShortMod: Time 0.936468s
Verify_ShortMod: Iterations 400
Verify_ShortMod: Instances 65536
Verify_ShortMod: Integer operations 3041394688
Verify_ShortMod: GIOp 3.041395
Verify_ShortMod: GIOP/S 3.247729
---------------
ushort2 Div fast:
---------------
Verify_ShortMod: Time 0.481168s
Verify_ShortMod: Iterations 400
Verify_ShortMod: Instances 65536
Verify_ShortMod: Integer operations 3041394688
Verify_ShortMod: GIOp 3.041395
Verify_ShortMod: GIOP/S 6.320858
---------------
ushort3 Div:
---------------
Verify_ShortMod: Time 1.518535s
Verify_ShortMod: Iterations 400
Verify_ShortMod: Instances 65536
Verify_ShortMod: Integer operations 4562092032
Verify_ShortMod: GIOp 4.562092
Verify_ShortMod: GIOP/S 3.004272
---------------
ushort3 Div fast:
---------------
Verify_ShortMod: Time 0.731963s
Verify_ShortMod: Iterations 400
Verify_ShortMod: Instances 65536
Verify_ShortMod: Integer operations 4562092032
Verify_ShortMod: GIOp 4.562092
Verify_ShortMod: GIOP/S 6.232681
---------------
ushort4 Div:
---------------
Verify_ShortMod: Time 2.163825s
Verify_ShortMod: Iterations 400
Verify_ShortMod: Instances 65536
Verify_ShortMod: Integer operations 6082789376
Verify_ShortMod: GIOp 6.082789
Verify_ShortMod: GIOP/S 2.811128
---------------
ushort4 Div fast:
---------------
Verify_ShortMod: Time 1.011028s
Verify_ShortMod: Iterations 400
Verify_ShortMod: Instances 65536
Verify_ShortMod: Integer operations 6082789376
Verify_ShortMod: GIOp 6.082789
Verify_ShortMod: GIOP/S 6.016440
shortmod --> passed
shortmod: Test took 26.41 seconds to run:
shortmod: Init 15.94s (60.36%) Compute 7.95s (30.11%)
Verify 2.52s (9.53)%
22
******************************************************************************
Char Operations Kernels:
Performs an online compilation of char add/mul/mad/div
kernels which each perform a large number of one specific
operation, and calculates the operations per second of the
device.
This test uses char{1,2,3,4} as its unit of computation.
********************************************************************************
Compiling Add char kernel...
Compiling Mul char kernel...
Compiling Mad char kernel...
Compiling Div char kernel...
Compiling Add char2 kernel...
Compiling Mul char2 kernel...
Compiling Mad char2 kernel...
Compiling Div char2 kernel...
Compiling Add char3 kernel...
Compiling Mul char3 kernel...
Compiling Mad char3 kernel...
Compiling Div char3 kernel...
Compiling Add char4 kernel...
Compiling Mul char4 kernel...
Compiling Mad char4 kernel...
Compiling Div char4 kernel...
Compute_Char: Online compilation test with char using 65536 instances, each with 400 iterations
Running Add kernel for char...
Running Mul kernel for char...
Running Mad kernel for char...
Running Div kernel for char...
Compute_Char: Online compilation test with char2 using 65536 instances, each with 400 iterations
Running Add kernel for char2...
Running Mul kernel for char2...
Running Mad kernel for char2...
Running Div kernel for char2...
Compute_Char: Online compilation test with char3 using 65536 instances, each with 400 iterations
Running Add kernel for char3...
Running Mul kernel for char3...
Running Mad kernel for char3...
Running Div kernel for char3...
Compute_Char: Online compilation test with char4 using 65536 instances, each with 400 iterations
Running Add kernel for char4...
Running Mul kernel for char4...
Running Mad kernel for char4...
Running Div kernel for char4...
---------------
char Add:
---------------
Verify_Char: Time 0.048132s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 996540416
Verify_Char: GIOp 0.996540
Verify_Char: GIOP/S 20.704322
---------------
char Mul:
---------------
Verify_Char: Time 0.031758s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 681836544
Verify_Char: GIOp 0.681837
Verify_Char: GIOP/S 21.469757
---------------
char Mad:
---------------
Verify_Char: Time 0.056299s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 2097414144
Verify_Char: GIOp 2.097414
Verify_Char: GIOP/S 37.254909
---------------
char Div:
---------------
Verify_Char: Time 0.261175s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 996409344
Verify_Char: GIOp 0.996409
Verify_Char: GIOP/S 3.815102
---------------
char2 Add:
---------------
Verify_Char: Time 0.095022s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 1993080832
Verify_Char: GIOp 1.993081
Verify_Char: GIOP/S 20.974941
---------------
char2 Mul:
---------------
Verify_Char: Time 0.061869s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 1363673088
Verify_Char: GIOp 1.363673
Verify_Char: GIOP/S 22.041298
---------------
char2 Mad:
---------------
Verify_Char: Time 0.110911s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 4194828288
Verify_Char: GIOp 4.194828
Verify_Char: GIOP/S 37.821571
---------------
char2 Div:
---------------
Verify_Char: Time 0.519456s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 1992818688
Verify_Char: GIOp 1.992819
Verify_Char: GIOP/S 3.836357
---------------
char3 Add:
---------------
Verify_Char: Time 0.141104s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 2989621248
Verify_Char: GIOp 2.989621
Verify_Char: GIOP/S 21.187360
---------------
char3 Mul:
---------------
Verify_Char: Time 0.091806s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 2045509632
Verify_Char: GIOp 2.045510
Verify_Char: GIOP/S 22.280784
---------------
char3 Mad:
---------------
Verify_Char: Time 0.165578s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 6292242432
Verify_Char: GIOp 6.292242
Verify_Char: GIOP/S 38.001682
---------------
char3 Div:
---------------
Verify_Char: Time 0.627681s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 2989228032
Verify_Char: GIOp 2.989228
Verify_Char: GIOP/S 4.762336
---------------
char4 Add:
---------------
Verify_Char: Time 0.188343s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 3986161664
Verify_Char: GIOp 3.986162
Verify_Char: GIOP/S 21.164374
---------------
char4 Mul:
---------------
Verify_Char: Time 0.121852s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 2727346176
Verify_Char: GIOp 2.727346
Verify_Char: GIOP/S 22.382449
---------------
char4 Mad:
---------------
Verify_Char: Time 0.220987s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 8389656576
Verify_Char: GIOp 8.389657
Verify_Char: GIOP/S 37.964480
---------------
char4 Div:
---------------
Verify_Char: Time 0.835537s
Verify_Char: Iterations 400
Verify_Char: Instances 65536
Verify_Char: Char operations 3985637376
Verify_Char: GIOp 3.985637
Verify_Char: GIOP/S 4.770151
char --> passed
char: Test took 19.13 seconds to run:
char: Init 9.78s (51.11%) Compute 4.30s (22.45%)
Verify 5.06s (26.43)%
23
******************************************************************************
Device Transfer Test:
Performs purely transfer operations using an OpenCL buffer from:
host --> device (Upload)
device --> device (Copy)
device --> host (Download)
It reports the average speed of each operation on the device.
********************************************************************************
Init_Transfer: Random seed set to 1507279508
Compute_Transfer: Starting transfer operations ... , 4MB, 4MB, 4MB, 4MB, 5MB, 5MB, 6MB, 7MB, 8MB, 9MB, 10MB, 12MB, 16MB, 21MB, 32MB, 64MB
Compute_Transfer: Transfer tests complete.
Actual speed
Verify_Transfer: Transfer Type | Size (MB) | Offset  | Time (s) | MB/s
Verify_Transfer: Upload        | 4         | 02c0a3d | 0.004000 | 1000.00
Verify_Transfer: Copy          | 4         | 01b1eb8 | 0.004000 | 1000.00
Verify_Transfer: Download      | 4         | 01b4ccc | 0.031000 | 129.03
Verify_Transfer: Upload        | 4         | 0273fb0 | 0.004000 | 1000.00
Verify_Transfer: Copy          | 4         | 01154a4 | 0.004000 | 1000.00
Verify_Transfer: Download      | 4         | 03502e7 | 0.035000 | 114.29
Verify_Transfer: Upload        | 4         | 02d2a6c | 0.005000 | 800.00
Verify_Transfer: Copy          | 4         | 026fc32 | 0.004000 | 1000.00
Verify_Transfer: Download      | 4         | 011ba08 | 0.036000 | 111.11
Verify_Transfer: Upload        | 4         | 03e5001 | 0.005000 | 800.00
Verify_Transfer: Copy          | 4         | 03e3d64 | 0.004000 | 1000.00
Verify_Transfer: Download      | 4         | 01eaf03 | 0.038000 | 105.26
Verify_Transfer: Upload        | 5         | 00d6041 | 0.005000 | 1000.00
Verify_Transfer: Copy          | 5         | 03b22d0 | 0.005000 | 1000.00
Verify_Transfer: Download      | 5         | 015fafe | 0.044000 | 113.64
Verify_Transfer: Upload        | 5         | 03f7897 | 0.006000 | 833.33
Verify_Transfer: Copy          | 5         | 01b2a66 | 0.005000 | 1000.00
Verify_Transfer: Download      | 5         | 00f7ca8 | 0.048000 | 104.17
Verify_Transfer: Upload        | 6         | 03f7f8c | 0.006000 | 1000.00
Verify_Transfer: Copy          | 6         | 04ba9d1 | 0.005000 | 1200.00
Verify_Transfer: Download      | 6         | 0368f08 | 0.050000 | 120.00
Verify_Transfer: Upload        | 7         | 022fe34 | 0.007000 | 1000.00
Verify_Transfer: Copy          | 7         | 01dd5c6 | 0.006000 | 1166.67
Verify_Transfer: Download      | 7         | 03f01f2 | 0.056000 | 125.00
Verify_Transfer: Upload        | 8         | 03acccc | 0.008000 | 1000.00
Verify_Transfer: Copy          | 8         | 018a3d7 | 0.007000 | 1142.86
Verify_Transfer: Download      | 8         | 01a6e97 | 0.062000 | 129.03
Verify_Transfer: Upload        | 9         | 015f15f | 0.008000 | 1125.00
Verify_Transfer: Copy          | 9         | 069d3c7 | 0.008000 | 1125.00
Verify_Transfer: Download      | 9         | 02f6585 | 0.072000 | 125.00
Verify_Transfer: Upload        | 10        | 06f5c29 | 0.010000 | 1000.00
Verify_Transfer: Copy          | 10        | 08468ad | 0.009000 | 1111.11
Verify_Transfer: Download      | 10        | 019e26a | 0.086000 | 116.28
Verify_Transfer: Upload        | 12        | 026fe71 | 0.011000 | 1090.91
Verify_Transfer: Copy          | 12        | 0599ed7 | 0.011000 | 1090.91
Verify_Transfer: Download      | 12        | 04a6223 | 0.109000 | 110.09
Verify_Transfer: Upload        | 16        | 0510625 | 0.014000 | 1142.86
Verify_Transfer: Copy          | 16        | 028b439 | 0.013000 | 1230.77
Verify_Transfer: Download      | 16        | 048f5c2 | 0.129000 | 124.03
Verify_Transfer: Upload        | 21        | 09ba5e3 | 0.019000 | 1105.26
Verify_Transfer: Copy          | 21        | 00fb38a | 0.017000 | 1235.29
Verify_Transfer: Download      | 21        | 08607bc | 0.172000 | 122.09
Verify_Transfer: Upload        | 32        | 0753f7d | 0.029000 | 1103.45
Verify_Transfer: Copy          | 32        | 097ced9 | 0.026000 | 1230.77
Verify_Transfer: Download      | 32        | 05851eb | 0.257000 | 124.51
Verify_Transfer: Upload        | 64        | 0000000 | 0.057000 | 1122.81
Verify_Transfer: Copy          | 64        | 0000000 | 0.049000 | 1306.12
Verify_Transfer: Download      | 64        | 0000000 | 0.314000 | 203.82
Verify_Transfer: Average upload speed 1007.73 MB/s
Verify_Transfer: Average copy speed 1114.97 MB/s
Verify_Transfer: Average download speed 123.58 MB/s
Verify_Transfer: Verifying data integrity
Verify_Transfer: Data integrity OK
transfer --> passed
transfer: Test took 9.31 seconds to run:
transfer: Init 4.12s (44.33%) Compute 2.59s (27.87%)
Verify 2.59s (27.80)%
24
******************************************************************************
Work-group Memory Copy Kernel:
Performs an online compilation of a kernel which copies input
to output using work-groups, verifying the results.
********************************************************************************
Verify_MemcpyWorkgroup: Instances: 6528, Copied 0.024902 MBs, Time 0.000325s 76.622596MB/s
Verify_MemcpyWorkgroup: Instances: 691200, Copied 2.636719 MBs, Time 0.004165s 633.065726MB/s
Verify_MemcpyWorkgroup: Instances: 31360, Copied 0.119629 MBs, Time 0.000377s 317.318054MB/s
Verify_MemcpyWorkgroup: 
Instances: 1344000, Copied 5.126953 MBs, Time 0.007608s 673.889738MB/smemcpy_workgroups --> passedmemcpy_workgroups: Test took 0.97 seconds to run:memcpy_workgroups: Init 0.14s (14.70%) Verify 0.82s (85.30)%25******************************************************************************Global-offset Memory Copy Kernel:Performs an online compilation of a kernel which copies inputto output using global offsets, verifying the results.********************************************************************************memcpy_global_offsets --> passedmemcpy_global_offsets: Test took 0.39 seconds to run:memcpy_global_offsets: Init 0.15s (38.68%) Verify 0.24s (61.32)%26******************************************************************************Image Convolution Test:Runs a number of image convolution kernels on an image********************************************************************************Failed to open source.bmp for readingRunning image copy, frame 0Running Sharpen kernel, frame 1Running Blur kernel, frame 2Running Gaussian kernel, frame 3Running Laplace Edge detection kernel, frame 4convolution --> passedconvolution: Test took 2.70 seconds to run:convolution: Init 1.96s (72.52%) Compute 0.29s (10.81%)Verify 0.45s (16.67)%27******************************************************************************Conversions Test:Tests conversions between various data types andverifies the results.********************************************************************************conversions --> passedconversions: Test took 20.44 seconds to run:conversions: Init 0.00s (0.04%) Verify 20.43s (99.95)%28******************************************************************************Events Test:Runs a mixture of kernels using events and not using eventsall of which perform a mem copy and verify the result********************************************************************************Verify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000298s 0.000000MB/sVerify_MulEvtKernel: 
Instances: 32, Copied 0 MBs, Time 0.000160s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000150s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000151s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000166s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000145s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000130s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000140s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000143s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000153s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000144s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000175s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000156s 0.000000MB/sVerify_MulEvtKernel: Instances: 32, Copied 0 MBs, Time 0.000148s 0.000000MB/sevents --> passedevents: Test took 0.78 seconds to run:events: Init 0.01s (2.42%) Verify 0.76s (97.57)%29******************************************************************************Atomics Test:Runs all atomic functions********************************************************************************atomics --> passedatomics: Test took 1.57 seconds to run:atomics: Init 0.00s (0.56%) Verify 1.57s (99.43)%30******************************************************************************NOP Test:Runs a kernel with no body effectively making it a NOP kernel********************************************************************************Compute_NOP: Online compilation test with 4096 instances running source:<source>__kernel void NOPKernel(){}</source>nop --> passednop: Test took 0.39 seconds to run:nop: Init 0.10s (25.76%) Compute 0.04s (10.45%)Verify 0.25s (63.77)%Finished 31 tests in 291.6 seconds: 31 passed, 0 failed (100.00%)
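
The MB/s column in the Verify_Transfer rows of the Device Transfer Test output above appears to be the transfer size divided by the elapsed time (for example, 4 MB in 0.031 s gives 129.03 MB/s), and the sixteen reported upload rates average to the 1007.73 MB/s shown in the "Average upload speed" line. The following is a minimal Python sketch for re-checking these figures from a captured log. It is not part of the test suite: the names transfer_rates and average_rate are hypothetical, and the regular expression assumes the row layout seen in this particular log.

```python
import re
from collections import defaultdict

# Matches one result row such as:
#   Verify_Transfer: Download | 4 | 01b4ccc | 0.031000 | 129.03
# The layout is an assumption taken from the log on this page.
ROW = re.compile(
    r"Verify_Transfer:\s*(Upload|Copy|Download)\s*\|\s*(\d+)\s*\|"
    r"\s*[0-9a-f]+\s*\|\s*([0-9.]+)\s*\|\s*([0-9.]+)"
)


def transfer_rates(log_text):
    """Return {type: [(size_mb, seconds, reported_mb_s, recomputed_mb_s), ...]}."""
    rows = defaultdict(list)
    for kind, size, secs, reported in ROW.findall(log_text):
        size_mb, seconds = int(size), float(secs)
        # Recompute the rate as size / time to compare with the logged value.
        rows[kind].append((size_mb, seconds, float(reported), size_mb / seconds))
    return dict(rows)


def average_rate(rows):
    """Arithmetic mean of the recomputed per-transfer rates."""
    return sum(r[3] for r in rows) / len(rows)


if __name__ == "__main__":
    # A few rows copied from the log; the pattern does not depend on newlines.
    sample = (
        "Verify_Transfer: Upload | 4 | 02c0a3d | 0.004000 | 1000.00 "
        "Verify_Transfer: Copy | 4 | 01b1eb8 | 0.004000 | 1000.00 "
        "Verify_Transfer: Download | 4 | 01b4ccc | 0.031000 | 129.03 "
        "Verify_Transfer: Download | 64 | 0000000 | 0.314000 | 203.82"
    )
    for kind, rows in transfer_rates(sample).items():
        print(f"{kind}: mean {average_rate(rows):.2f} MB/s over {len(rows)} rows")
```

Because the pattern is applied with findall over the whole text, it works on the log whether or not the original line breaks have survived capture.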