A data science team is preparing a large text dataset for fine-tuning a Llama 3 model. The dataset consists of 500GB of raw text files. The team needs to perform tokenization and data cleaning as quickly as possible. Which NVIDIA library is specifically designed for GPU-accelerated data manipulation and would be most suitable for this task?