Miska Posted July 26, 2017 Share Posted July 26, 2017 5 hours ago, sbenyo said: Is it for sure that dual-gpu (e.g. gtx 1070, 1080) is not supported? At least not explicitly. Depending on how Nvidia implemented some of the CUDA functionality, multiple GPUs could possibly work when you have also convolution enabled. But since I don't have any multi-GPU machines I have not checked whether this happens or not. Signalyst - Developer of HQPlayer Pulse & Fidelity - Software Defined Amplifiers Link to comment
john925 Posted August 4, 2017 Share Posted August 4, 2017 Is eGPU possible for cuda offload? Link to comment
mirekti Posted August 4, 2017 Share Posted August 4, 2017 28 minutes ago, john925 said: Is eGPU possible for cuda offload? I'd say it is, but: 1. TB3 has its limits and you lose some of your GPU power. From what I've read, the better the card, the more you lose. 2. External GPU cases are loud (in case you plan to keep it in the same room). 3. The eGPU cases are expensive. Vinnie Rossi LIO (AVC/Tubestage, AMP Module with built in HPF 100Hz 24dB/octave, DAC 2.0), Harbeth P3ESR, Rythmik F8 Win10 i7-7700 -> Roon -> HQPlayer DSD512- > LIO 100Hz HPF -> Harbeth P3ESR ->LIO -> miniDSP <100Hz -> Rythmik F8 Link to comment
bigbear2003 Posted August 24, 2017 Share Posted August 24, 2017 I tried the 940MX on my Lenovo laptop and I don't see any improvement in CPU utilization. Link to comment
Miska Posted August 30, 2017 Share Posted August 30, 2017 On 8/24/2017 at 5:11 AM, bigbear2003 said: I tried the 940MX on my Lenovo laptop and I don't see any improvement in CPU utilization. What is your output rate and filter settings? Signalyst - Developer of HQPlayer Pulse & Fidelity - Software Defined Amplifiers Link to comment
Ipoci Posted September 12, 2017 Share Posted September 12, 2017 On 8/24/2017 at 4:11 AM, bigbear2003 said: I tried the 940MX on my Lenovo laptop and I don't see any improvement in CPU utilization. Did you check with nvidia-smi whether hqplayer is using the card or not? My laptop is an HP zBook G2 17" with an i7-4940MX and an Nvidia Quadro K5100M. In my case, HQPlayer always uses the Nvidia card, or at least seems to use it, even if I deselect "cuda offload"... So CPU% is always the same on the different cores: 1-3% for plain DSD64. Have a nice day, Massimiliano Link to comment
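One way to run the check Ipoci describes is to ask nvidia-smi for its list of compute processes and look for hqplayer in it. The following is only a minimal sketch, not anything that ships with HQPlayer: the helper names `find_gpu_processes` and `query_nvidia_smi` are made up for illustration, and it assumes an nvidia-smi build that supports `--query-compute-apps` with CSV output.

```python
import csv
import io
import subprocess

# nvidia-smi query for processes that currently hold a compute (CUDA) context.
QUERY = ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader"]


def find_gpu_processes(csv_text, needle="hqplayer"):
    """Return (pid, process_name, used_memory) rows whose name contains needle."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        pid, name, mem = (field.strip() for field in row)
        if needle.lower() in name.lower():
            rows.append((int(pid), name, mem))
    return rows


def query_nvidia_smi():
    """Run nvidia-smi and return its CSV output (needs the NVIDIA driver installed)."""
    return subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
```

Calling `find_gpu_processes(query_nvidia_smi())` on a machine with the driver installed lists any matching process. A match only shows that the process holds a CUDA context on the card; as noted above, that can be the case even with "cuda offload" deselected, so it does not prove that DSP work is actually being offloaded.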
ferenc Posted October 1, 2017 Share Posted October 1, 2017 On 2017. 07. 26. at 10:40 PM, Miska said: At least not explicitly. Depending on how Nvidia implemented some of the CUDA functionality, multiple GPUs could possibly work when you have also convolution enabled. But since I don't have any multi-GPU machines I have not checked whether this happens or not. This should be interesting with 16x 1080 Ti cards for example : https://www.onestopsystems.com/product/4u-value-gpu-accelerator-system Link to comment
yamamoto2002 Posted December 20, 2017 Share Posted December 20, 2017 It seems double precision arithmetic of Titan V card is 22 times faster than GTX 1080. Sunday programmer since 1985 Developer of PlayPcmWin Link to comment
Hoang Anh Posted May 11, 2018 Share Posted May 11, 2018 Since my PC (Core i5 6400; 12GB RAM) doesn't support a GTX 1060 well (it needs more power), I intend to use a GTX 1050 with it. Is there any benefit in that? Since I know that a GPU that is too slow will bottleneck HQPlayer, I'd also like to ask whether there will be an option to choose what percentage of the work is split between the GPU and the CPU, to make sure it works best. Link to comment
Hoang Anh Posted May 13, 2018 Share Posted May 13, 2018 I hope AMD or older Quadro cards (the 6000, for example; used ones are very cheap now and have good FP64) will be supported in the next version of HQPlayer. Then there will be more choice. Link to comment
yamamoto2002 Posted June 12, 2018 Share Posted June 12, 2018 This is my Titan V result. About 6 TFLOPS doubleprec, One-seventh of Earth Simulator Gen1 Supercomputer . I hope upcoming Geforce Volta products may have some doubleprec capability Miska 1 Sunday programmer since 1985 Developer of PlayPcmWin Link to comment
louisxiawei Posted June 12, 2018 Share Posted June 12, 2018 26 minutes ago, yamamoto2002 said: This is my Titan V result. About 6 TFLOPS doubleprec, One-seventh of Earth Simulator Gen1 Supercomputer . I hope upcoming Geforce Volta products may have some doubleprec capability What a beast! Have you run any HQplayer heavy filter setting with it? Something like upsampling 44.1/16 → 48 x 512 using poly-sinc-xtr filter? Meanwhile AMD and Intel are having the "multi-core" CPU competition. All good for HQplayer Software: Roon, Tidal, HQplayer HQplayer PC: i9 7980XE, Titan Xp, RTX 3090; i9 9900K, Titan V DAC: Holo Audio MAY L2, T+A DAC8 DSD, exasound e12, iFi micro iDSD BL USB tweaks: Intona, Uptone (ISO) regen, LPS-1, LPS-1.2, Sbooster Vbus2, Curious cables, SUPRA Certified HiSpeed USB cable NAA: Logic CL100 powered by Uptone JS-2 AMP: Spectral DMC 30SV, Spectral DMA 300RS Speaker: Magico S3 MKII Rack: HRS SXR signature Link to comment
yamamoto2002 Posted June 14, 2018 Share Posted June 14, 2018 On 6/12/2018 at 11:31 PM, louisxiawei said: What a beast! Have you run any HQplayer heavy filter setting with it? Something like upsampling 44.1/16 → 48 x 512 using poly-sinc-xtr filter? No. It seems that in order to run CUDA programs on Volta, programs should be compiled using the latest version of the CUDA Toolkit, which dropped support for older Fermi-based GPUs such as the GeForce GTX 580 or Quadro 6000. I'm not sure whether this affects HQP. > Meanwhile AMD and Intel are having the "multi-core" CPU competition. All good for HQplayer Yes, it is a good thing. On Windows, apps that are not processor-group-aware can handle up to 64 cores (or 64 hyper-threads). The process affinity mask is 64 bits wide, with one bit associated with each core (or hyper-thread), so it can express at most 64 cores (or hyper-threads). With a 32-core, 64-thread CPU, all the available affinity mask bits are used, and the free performance improvement a multi-threaded app gets from more CPU cores ends there. If this trend continues and, say, a 64-core, 128-thread CPU arrives, apps will have to be rewritten to use multiple processor groups to squeeze out all the CPU resources. Sunday programmer since 1985 Developer of PlayPcmWin Link to comment
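The affinity-mask arithmetic in the post above can be sketched in a few lines. This is only an illustration of the 64-bit limit as described there; the function names are hypothetical, and the group count is a simple lower bound (Windows may divide processors into groups differently in practice, for example along NUMA boundaries).

```python
MASK_BITS = 64  # width of the process affinity mask described in the post above


def full_affinity_mask(logical_processors):
    """Bitmask with one bit set per logical processor, capped at 64 bits."""
    usable = min(logical_processors, MASK_BITS)
    return (1 << usable) - 1


def processor_groups_needed(logical_processors):
    """Minimum number of 64-entry processor groups covering all logical processors."""
    return -(-logical_processors // MASK_BITS)  # ceiling division


# The two CPUs mentioned in the post: 32C/64T fills the mask, 64C/128T overflows it.
for cores, threads in [(32, 64), (64, 128)]:
    print(f"{cores}C/{threads}T: mask={full_affinity_mask(threads):#018x}, "
          f"groups={processor_groups_needed(threads)}")
```

With 64 hardware threads every bit of the mask is set and one group suffices; with 128 the mask cannot grow any further and at least two groups are needed, which is the point at which group-unaware code stops scaling.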
Miska Posted June 15, 2018 Share Posted June 15, 2018 10 hours ago, yamamoto2002 said: It seems that in order to run CUDA programs on Volta, programs should be compiled using the latest version of the CUDA Toolkit, which dropped support for older Fermi-based GPUs such as the GeForce GTX 580 or Quadro 6000. I'm not sure whether this affects HQP. The latest HQPlayer Desktop 3.21 is compiled with the latest CUDA 9.2. But earlier versions compiled with CUDA 9.1 already had full support for Volta. Availability of the latest CUDA toolkit actually delayed my release, because Microsoft's update to Visual Studio 2017 broke the previous CUDA toolkit version... But the CUDA-Z test application you are running is compiled against CUDA 5 or 6 or something really old. 10 hours ago, yamamoto2002 said: Yes, it is a good thing. On Windows, apps that are not processor-group-aware can handle up to 64 cores (or 64 hyper-threads). The process affinity mask is 64 bits wide, with one bit associated with each core (or hyper-thread), so it can express at most 64 cores (or hyper-threads). With a 32-core, 64-thread CPU, all the available affinity mask bits are used, and the free performance improvement a multi-threaded app gets from more CPU cores ends there. If this trend continues and, say, a 64-core, 128-thread CPU arrives, apps will have to be rewritten to use multiple processor groups to squeeze out all the CPU resources. Luckily I have very little Windows specific code and no limitations for number of CPU cores. So no need to rewrite anything... And I can also warmly recommend using Linux... yamamoto2002 1 Signalyst - Developer of HQPlayer Pulse & Fidelity - Software Defined Amplifiers Link to comment
yamamoto2002 Posted June 15, 2018 Share Posted June 15, 2018 3 hours ago, Miska said: But the CUDA-Z test application you are running is compiled against CUDA 5 or 6 or something really old. Thanks for your reply. I now understand CUDA binary forward compatibility, and things are cleared up: https://docs.nvidia.com/cuda/volta-compatibility-guide/index.html Quote Applications that already include PTX versions of their kernels should work as-is on Volta-based GPUs. Applications that only support specific GPU architectures via cubin files, however, will need to be updated to provide Volta-compatible PTX or cubins. So, CUDA-Z contains PTX code for forward compatibility, and if you are lucky enough, the PTX code runs well on a future architecture. Sunday programmer since 1985 Developer of PlayPcmWin Link to comment
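The compatibility rule quoted above can be written out as a small decision function. This is a simplified sketch of the documented behaviour (a cubin runs only on the same major architecture with an equal or newer minor revision, while PTX can be JIT-compiled by the driver for any equal or newer compute capability). The function name `can_run` and the example architecture lists are hypothetical; which architectures CUDA-Z actually embeds is not stated in the thread.

```python
def can_run(device_cc, cubin_ccs, ptx_ccs):
    """Decide how a CUDA binary could run on a device.

    device_cc: (major, minor) compute capability of the GPU.
    cubin_ccs: capabilities the binary ships native cubins for.
    ptx_ccs:   capabilities the binary ships PTX for.
    Returns "cubin", "ptx-jit" or None.
    """
    major, minor = device_cc
    # A cubin is usable only within the same major architecture,
    # on a device whose minor revision is equal or newer.
    if any(c_major == major and c_minor <= minor for c_major, c_minor in cubin_ccs):
        return "cubin"
    # PTX can be JIT-compiled by the driver for any equal or newer capability.
    if any(ptx_cc <= device_cc for ptx_cc in ptx_ccs):
        return "ptx-jit"
    return None


VOLTA = (7, 0)  # compute capability of the Titan V

# Hypothetical old application built for Fermi/Kepler that also embeds Kepler PTX.
print(can_run(VOLTA, cubin_ccs=[(2, 0), (3, 0)], ptx_ccs=[(3, 0)]))
# The same application without any embedded PTX.
print(can_run(VOLTA, cubin_ccs=[(2, 0), (3, 0)], ptx_ccs=[]))
```

The first call falls back to JIT-compiling the PTX, which matches the "should work as-is" case in the quote; the second has nothing the Volta card can execute, which is the "will need to be updated" case.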
yamamoto2002 Posted June 24, 2018 Share Posted June 24, 2018 On 6/15/2018 at 6:29 PM, Miska said: Luckily I have very little Windows specific code and no limitations for number of CPU cores. So no need to rewrite anything... Last I checked, on Windows, one process can handle up to 64 hyper-threads. So, in order to handle 128 hyper-threads, another worker process has to be created, each process creates 64 threads, and the two processes communicate via inter-process communication. This is a significant rewrite from the casual multi-threading code of a Sunday programmer. Sunday programmer since 1985 Developer of PlayPcmWin Link to comment
Miska Posted June 24, 2018 Share Posted June 24, 2018 2 hours ago, yamamoto2002 said: Last I checked, on Windows, one process can handle up to 64 hyper-threads. So, in order to handle 128 hyper-threads, another worker process has to be created, each process creates 64 threads, and the two processes communicate via inter-process communication. This is a significant rewrite from the casual multi-threading code of a Sunday programmer. I'm not a casual multi-threading Sunday programmer... I also know how to program HPC clusters / supercomputers. But I still recommend going for Linux if you have anything more than a traditional average PC. It is not about my programming, it is about Microsoft's. Anyway, this is going off-topic for CUDA & HQPlayer. yamamoto2002 1 Signalyst - Developer of HQPlayer Pulse & Fidelity - Software Defined Amplifiers Link to comment
yamamoto2002 Posted June 30, 2018 Share Posted June 30, 2018 Thank you for your reply. I found that Linux has much better multi-threading support for a casual Sunday programmer. Also, there are several cross-platform libraries that overcome this kind of OS-specific quirk. And sorry for the topic drift. Sunday programmer since 1985 Developer of PlayPcmWin Link to comment
2a3set Posted July 20, 2018 Share Posted July 20, 2018 Wondering how to enable K20 for cuda offload correctly in ubuntu 16. hqplayer process is using correct gpu, but with cuda offload it hiccups more (using e5-2680v2 10 core xeon)
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37                 Driver Version: 396.37                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 00000000:02:00.0 N/A |                  N/A |
| 40%   41C    P8    N/A /  N/A |     95MiB /  2000MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20Xm         Off  | 00000000:04:00.0 Off |                  Off |
| N/A   77C    P0    59W / 235W |    106MiB /  6083MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
|    1      4660      C   /usr/bin/hqplayer                             95MiB |
+-----------------------------------------------------------------------------+
Link to comment
Whitigir Posted December 2, 2018 Share Posted December 2, 2018 I am going to buy a cheap CUDA 1030 with passive heatsink and hope it can do some bidding here Link to comment
Miska Posted December 4, 2018 Share Posted December 4, 2018 On 7/20/2018 at 10:32 PM, 2a3set said: Wondering how to enable K20 for cuda offload correctly in ubuntu 16. hqplayer process is using correct gpu, but with cuda offload it hiccups more (using e5-2680v2 10 core xeon)
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37                 Driver Version: 396.37                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 00000000:02:00.0 N/A |                  N/A |
| 40%   41C    P8    N/A /  N/A |     95MiB /  2000MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20Xm         Off  | 00000000:04:00.0 Off |                  Off |
| N/A   77C    P0    59W / 235W |    106MiB /  6083MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
|    1      4660      C   /usr/bin/hqplayer                             95MiB |
+-----------------------------------------------------------------------------+
Is this with latest HQPlayer version? If that is the case, your driver is too old (396), you need latest driver (>= 410) for CUDA 10 support. Does HQPlayer tell that the offload is enabled when you start playback? Now the GPU utilization is shown as 0% so things are probably working as they should... Signalyst - Developer of HQPlayer Pulse & Fidelity - Software Defined Amplifiers Link to comment
Miska Posted December 4, 2018 Share Posted December 4, 2018 5 hours ago, Miska said: Now the GPU utilization is shown as 0% so things are probably working as they should... Ehh; should be "not working as they should"... Signalyst - Developer of HQPlayer Pulse & Fidelity - Software Defined Amplifiers Link to comment
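The driver check Miska describes (396 is too old, 410 or newer is needed for CUDA 10) can be automated by reading the "Driver Version" field from the nvidia-smi header. The following is a rough sketch under that assumption; the function names are hypothetical, and the 410 threshold is simply the figure quoted in the post above.

```python
import re

MIN_DRIVER_MAJOR_FOR_CUDA_10 = 410  # threshold quoted in the post above


def parse_driver_version(smi_header):
    """Extract the driver version from nvidia-smi header text as a tuple of ints."""
    match = re.search(r"Driver Version:\s*(\d+(?:\.\d+)*)", smi_header)
    if match is None:
        raise ValueError("no 'Driver Version' field found")
    return tuple(int(part) for part in match.group(1).split("."))


def driver_supports_cuda_10(smi_header):
    """True if the driver's major version meets the CUDA 10 threshold."""
    return parse_driver_version(smi_header)[0] >= MIN_DRIVER_MAJOR_FOR_CUDA_10


# Header line from the nvidia-smi output posted earlier in the thread.
header = "| NVIDIA-SMI 396.37                 Driver Version: 396.37 |"
print(parse_driver_version(header), driver_supports_cuda_10(header))
```

For the 396.37 driver in the posted output the check fails, which is consistent with the K20 sitting at 0% utilization while playback runs on the CPU.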
simonklp Posted December 6, 2018 Share Posted December 6, 2018 On Tuesday, December 04, 2018 at 11:14 PM, Miska said: Ehh; should be "not working as they should"... Hi Miska, for HQP Desktop, I understand that there is a message at the bottom left corner telling us that it is enabled during the first few seconds when the music starts playing. But for HQPE, how do we know that the CUDA offload is working properly? Thanks. Link to comment
Miska Posted December 6, 2018 Share Posted December 6, 2018 3 hours ago, simonklp said: But for HQPE, how do we know that the CUDA offload is working properly? I've now added indication about this on the front page status table. And also improved logging about this. Signalyst - Developer of HQPlayer Pulse & Fidelity - Software Defined Amplifiers Link to comment
simonklp Posted December 6, 2018 Share Posted December 6, 2018 26 minutes ago, Miska said: I've now added indication about this on the front page status table. And also improved logging about this. Hi Miska, noted with thanks. Do you mean that the front page of web configuration includes this status table? Link to comment