One-Click Migration of CUDA Code

This feature depends on the CUDA build of torch.

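To make it easier to migrate user code from CUDA to the GCU platform, torch_gcu supports one-click migration. Simply run the following import before any user code executes:

    from torch_gcu import transfer_to_gcu

User code can then run on GCU hardware.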

Usage example

    import torch
    try:
        import torch_gcu  # import torch_gcu
        from torch_gcu import transfer_to_gcu  # importing transfer_to_gcu enables one-click migration
    except Exception as e:
        print(e)

    # User code (unchanged CUDA-style code)
    print(f"torch.cuda.is_available(): {torch.cuda.is_available()}")

    a_tensor = torch.ones(3, 3).cuda()
    b_tensor = torch.ones(size=(3, 1), device="cuda")

    add_out = torch.add(a_tensor, b_tensor)
    print(add_out)

torch API compatibility list

Table 3. torch API compatibility list

| API | Is Supported | Comment |
|---|---|---|
| torch.logspace | Yes | replaced "cuda" or int arguments |
| torch.randint | Yes | replaced "cuda" or int arguments |
| torch.hann_window | Yes | replaced "cuda" or int arguments |
| torch.rand | Yes | replaced "cuda" or int arguments |
| torch.full_like | Yes | replaced "cuda" or int arguments |
| torch.ones_like | Yes | replaced "cuda" or int arguments |
| torch.rand_like | Yes | replaced "cuda" or int arguments |
| torch.randperm | Yes | replaced "cuda" or int arguments |
| torch.arange | Yes | replaced "cuda" or int arguments |
| torch.frombuffer | No | PyTorch has bugs when specifying device |
| torch.normal | Yes | replaced "cuda" or int arguments |
| torch._empty_per_channel_affine_quantized | No | GCU does not support _empty_per_channel_affine_quantized |
| torch.empty_strided | Yes | replaced "cuda" or int arguments |
| torch.empty_like | Yes | replaced "cuda" or int arguments |
| torch.scalar_tensor | Yes | replaced "cuda" or int arguments |
| torch.tril_indices | Yes | replaced "cuda" or int arguments |
| torch.bartlett_window | Yes | replaced "cuda" or int arguments |
| torch.ones | Yes | replaced "cuda" or int arguments |
| torch.sparse_coo_tensor | No | GCU does not support sparse tensor |
| torch.randn | Yes | replaced "cuda" or int arguments |
| torch.kaiser_window | No | DispatchStub: missing PrivateUse1 kernel |
| torch.tensor | Yes | replaced "cuda" or int arguments |
| torch.triu_indices | Yes | replaced "cuda" or int arguments |
| torch.as_tensor | Yes | replaced "cuda" or int arguments |
| torch.zeros | Yes | replaced "cuda" or int arguments |
| torch.randint_like | Yes | replaced "cuda" or int arguments |
| torch.full | Yes | replaced "cuda" or int arguments |
| torch.eye | Yes | replaced "cuda" or int arguments |
| torch._sparse_csr_tensor_unsafe | No | GCU does not support sparse tensor |
| torch.blackman_window | Yes | replaced "cuda" or int arguments |
| torch.zeros_like | Yes | replaced "cuda" or int arguments |
| torch.range | Yes | replaced "cuda" or int arguments |
| torch.sparse_csr_tensor | No | GCU does not support sparse tensor |
| torch.randn_like | Yes | replaced "cuda" or int arguments |
| torch._cudnn_init_dropout_state | No | GCU does not support _cudnn_init_dropout_state |
| torch._empty_affine_quantized | No | GCU does not support quint8 |
| torch.linspace | Yes | replaced "cuda" or int arguments |
| torch.hamming_window | Yes | replaced "cuda" or int arguments |
| torch.empty_quantized | No | GCU does not support quint8 |
| torch._pin_memory | Yes | replaced "cuda" or int arguments |
| torch.from_file | No | CUDA does not support specifying device |
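The comment `replaced "cuda" or int arguments` means that a CUDA-style device argument is rewritten before the call reaches the backend. The following is a minimal sketch of that idea, not torch_gcu's actual implementation: the helper names `_to_gcu` and `rewrite_device`, the stand-in factory `make_tensor`, and the `"gcu"` / `"gcu:N"` target strings are all assumptions made for illustration, and only the `device` keyword is handled.

```python
import functools


def _to_gcu(value):
    """Map a CUDA-style device spec to a hypothetical GCU equivalent."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return f"gcu:{value}"
    if isinstance(value, str) and (value == "cuda" or value.startswith("cuda:")):
        return "gcu" + value[len("cuda"):]
    return value  # anything else (e.g. "cpu") passes through unchanged


def rewrite_device(fn):
    """Wrap fn so that its `device` keyword is rewritten before the call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if "device" in kwargs:
            kwargs["device"] = _to_gcu(kwargs["device"])
        return fn(*args, **kwargs)
    return wrapper


@rewrite_device
def make_tensor(*size, device="cpu"):
    # Stand-in for a factory function such as torch.ones (hypothetical).
    return {"size": size, "device": device}
```

With this wrapper in place, `make_tensor(3, 3, device="cuda:1")` receives `device="gcu:1"`, while `device="cpu"` is left untouched.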

Warning

The torch.jit.script module is disabled entirely.

torch.cuda API compatibility list

Table 4. torch.cuda API compatibility list

| API | Is Supported | Comment |
|---|---|---|
| torch.cuda.get_device_properties | Yes | replaced "cuda" or int arguments, replaced with torch.gcu.get_device_properties |
| torch.cuda.get_device_name | Yes | replaced with torch.gcu.get_device_name |
| torch.cuda.get_device_capability | Yes | replaced with torch.gcu.get_device_capability |
| torch.cuda.list_gpu_processes | No | |
| torch.cuda.set_device | Yes | replaced "cuda" or int arguments, replaced with torch.gcu.set_device |
| torch.cuda.synchronize | Yes | replaced with torch.gcu.synchronize |
| torch.cuda.mem_get_info | Yes | replaced with torch.gcu.mem_get_info |
| torch.cuda.memory_stats | Yes | replaced with torch.gcu.memory_stats |
| torch.cuda.memory_summary | Yes | replaced with torch.gcu.memory_summary |
| torch.cuda.memory_allocated | Yes | replaced with torch.gcu.memory_allocated |
| torch.cuda.max_memory_allocated | Yes | replaced with torch.gcu.max_memory_allocated |
| torch.cuda.reset_max_memory_allocated | Yes | replaced with torch.gcu.reset_max_memory_allocated |
| torch.cuda.memory_reserved | Yes | replaced with torch.gcu.memory_reserved |
| torch.cuda.max_memory_reserved | Yes | replaced with torch.gcu.max_memory_reserved |
| torch.cuda.reset_max_memory_cached | Yes | replaced with torch.gcu.reset_max_memory_cached |
| torch.cuda.is_bf16_supported | Yes | replaced with torch.gcu.is_bf16_supported |
| torch.cuda.StreamContext | Yes | replaced with torch.gcu.StreamContext |
| torch.cuda.init | Yes | replaced with torch.gcu.init |
| torch.cuda._lazy_call | Yes | replaced with torch.gcu._lazy_call |
| torch.cuda._lazy_init | Yes | replaced with torch.gcu.init |
| torch.cuda._tls | Yes | replaced with torch.gcu._tls |
| torch.cuda._initialization_lock | Yes | replaced with torch.gcu._initialization_lock |
| torch.cuda._queued_calls | Yes | replaced with torch.gcu._queued_calls |
| torch.cuda._is_in_bad_fork | Yes | replaced with torch.gcu._is_in_bad_fork |
| torch.cuda._LazySeedTracker | Yes | replaced with torch.gcu._LazySeedTracker |
| torch.cuda.LazyProperty | Yes | replaced with torch.gcu.LazyProperty |
| torch.cuda._lazy_seed_tracker | Yes | replaced with torch.gcu._lazy_seed_tracker |
| torch.cuda.DoubleTensor | Yes | replaced with torch.gcu.DoubleTensor |
| torch.cuda.FloatTensor | Yes | replaced with torch.gcu.FloatTensor |
| torch.cuda.HalfTensor | Yes | replaced with torch.gcu.HalfTensor |
| torch.cuda.BFloat16Tensor | Yes | replaced with torch.gcu.BFloat16Tensor |
| torch.cuda.LongTensor | Yes | replaced with torch.gcu.LongTensor |
| torch.cuda.IntTensor | Yes | replaced with torch.gcu.IntTensor |
| torch.cuda.ShortTensor | Yes | replaced with torch.gcu.ShortTensor |
| torch.cuda.BoolTensor | Yes | replaced with torch.gcu.BoolTensor |
| torch.cuda.CharTensor | Yes | replaced with torch.gcu.CharTensor |
| torch.cuda.ByteTensor | Yes | replaced with torch.gcu.ByteTensor |

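torch.cuda.nvtx API compatibility list

Once the migration script is imported, the torch.cuda.nvtx module is automatically replaced with torch.gcu.topstx.

Table 5. torch.cuda.nvtx API compatibility list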

| API | Is Supported | Comment |
|---|---|---|
| torch.cuda.nvtx.range_push | Yes | replaced with torch.gcu.topstx.range_push |
| torch.cuda.nvtx.range_pop | Yes | replaced with torch.gcu.topstx.range_pop |
| torch.cuda.nvtx.range_start | Yes | replaced with torch.gcu.topstx.range_start |
| torch.cuda.nvtx.range_end | Yes | replaced with torch.gcu.topstx.range_end |
| torch.cuda.nvtx.mark | Yes | replaced with torch.gcu.topstx.mark |
| torch.cuda.nvtx.range | Yes | replaced with torch.gcu.topstx.range |
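Replacing a whole module, as described for torch.cuda.nvtx, can be pictured as rebinding one attribute so that existing call sites resolve to the new implementation. The sketch below illustrates this with plain Python namespaces; `cuda`, `topstx`, and the recording list `calls` are hypothetical stand-ins, not the real torch modules, and the actual mechanism used by torch_gcu may differ.

```python
import types

# Hypothetical stand-ins for torch.gcu.topstx and torch.cuda.
calls = []
topstx = types.SimpleNamespace(
    range_push=lambda msg: calls.append(("push", msg)),
    range_pop=lambda: calls.append(("pop",)),
)
cuda = types.SimpleNamespace(nvtx=None)

# The substitution: after this rebinding, cuda.nvtx.* resolves to topstx.*
cuda.nvtx = topstx

# Existing user code keeps calling the CUDA-style names unchanged.
cuda.nvtx.range_push("forward")
cuda.nvtx.range_pop()
```

Because only the attribute binding changes, user code that calls `range_push` / `range_pop` through the old path needs no edits.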

torch.Tensor API compatibility list

Table 6. torch.Tensor API compatibility list

| API | Is Supported | Comment |
|---|---|---|
| torch.Tensor.new_empty | Yes | replaced "cuda" or int arguments |
| torch.Tensor.new_empty_strided | Yes | replaced "cuda" or int arguments |
| torch.Tensor.new_full | Yes | replaced "cuda" or int arguments |
| torch.Tensor.new_ones | Yes | replaced "cuda" or int arguments |
| torch.Tensor.new_tensor | Yes | replaced "cuda" or int arguments |
| torch.Tensor.new_zeros | Yes | replaced "cuda" or int arguments |
| torch.Tensor.to | Yes | replaced "cuda" or int arguments |
| torch.Tensor.cuda | Yes | replaced with torch.Tensor.gcu |
| torch.Tensor.is_cuda | Yes | replaced with torch.Tensor.is_gcu |

torch.nn.Module API compatibility list

Table 7. torch.nn.Module API compatibility list

| API | Is Supported | Comment |
|---|---|---|
| torch.nn.Module.to | Yes | replaced "cuda" or int arguments |
| torch.nn.Module.to_empty | Yes | replaced "cuda" or int arguments |
| torch.nn.Module.cuda | Yes | replaced with torch.nn.Module.gcu |

NCCL/distributed API compatibility list

Table 8. NCCL/distributed API compatibility list

| API | Is Supported | Comment |
|---|---|---|
| torch.distributed.init_process_group | Yes | replaced "nccl" / "NCCL" arguments with "eccl" / "ECCL" |
| torch.distributed.is_nccl_available | Yes | replaced with torch.distributed.is_eccl_available |
| torch.nn.parallel.DistributedDataParallel.__init__ | Yes | replaced "nccl" / "NCCL" arguments with "eccl" / "ECCL" |
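The backend-name substitution described in the table can be sketched as a small string rewrite. This is an illustration only: the helper name `to_eccl` is hypothetical, and the sketch assumes the backend is passed as a plain string, which may not cover every way torch_gcu handles the argument.

```python
def to_eccl(backend):
    """Replace "nccl"/"NCCL" with "eccl"/"ECCL"; leave other values alone."""
    if not isinstance(backend, str):
        return backend  # e.g. None, meaning "use the default backend"
    return backend.replace("nccl", "eccl").replace("NCCL", "ECCL")
```

For example, `to_eccl("nccl")` yields `"eccl"`, while `to_eccl("gloo")` is returned unchanged.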

DataLoader API compatibility list

Table 9. DataLoader API compatibility list

| API | Is Supported | Comment |
|---|---|---|
| torch.utils.data.DataLoader.__init__ | Yes | replaced "cuda" or int arguments |

torch.cuda APIs disabled by stubbing

Table 10. torch.cuda APIs disabled by stubbing

| API | Is Supported | Comment |
|---|---|---|
| torch.cuda.is_current_stream_capturing | Yes | lambda: False |
| torch.cuda.memory_snapshot | Yes | lambda: None |
| torch.cuda.amp.common.amp_definitely_not_available | Yes | lambda: False |
| torch.cuda.ipc_collect | Yes | lambda *args, **kwargs: None |
| torch.cuda.utilization | Yes | lambda *args, **kwargs: 0 |
| torch.cuda.graph | Yes | contextlib.nullcontext() |
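The stubs listed in the Comment column turn these calls into harmless no-ops instead of failing. The sketch below shows the pattern on a hypothetical namespace `cuda` standing in for torch.cuda. The way `graph` is wired here (a callable that returns `contextlib.nullcontext()`) is an assumption about how the stub is used, not a statement of torch_gcu's implementation.

```python
import contextlib
import types

# Hypothetical namespace standing in for torch.cuda.
cuda = types.SimpleNamespace()

# Each disabled API is replaced with a stub that returns a fixed value.
cuda.is_current_stream_capturing = lambda: False
cuda.memory_snapshot = lambda: None
cuda.ipc_collect = lambda *args, **kwargs: None
cuda.utilization = lambda *args, **kwargs: 0

# Assumption: graph is used as `with cuda.graph(g): ...`, so the stub
# returns a no-op context manager and the body simply runs eagerly.
cuda.graph = lambda *args, **kwargs: contextlib.nullcontext()

body_ran = False
with cuda.graph(object()):
    body_ran = True
```

Code that queries utilization or wraps a region in a graph capture therefore keeps running, but the returned values carry no real device information.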