feat(ascend): add Ascend framework layer — runtime, type mapping, bui…#46

Open
zhangyue207 wants to merge 12 commits into master from feat/ascend-framework

@zhangyue207

…ld integration

Add Ascend platform scaffolding:

  • device_.h: DeviceEnabled<kAscend> specialization
  • data_type_.h: toAclDtype(), isIntegerDtype()
  • common.h: buildAclTensor() with optional transpose
  • workspace_pool_.h: stream-keyed workspace allocator
  • runtime_.h: Runtime<kAscend> (Malloc, Free, Memcpy, Memset)
  • 5 new operator base classes (AddRmsNorm, FlashAttention, Matmul, ReshapeAndCache, RotaryEmbedding)

Integrate into CMake build system, Python binding generation (stream + optional tensor support), and examples runtime API.
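
For readers unfamiliar with the idea, a stream-keyed workspace allocator can be sketched roughly as below. This is only an illustration of the concept named in the list above, not the contents of `workspace_pool_.h`: `WorkspacePoolSketch`, `StreamHandle`, `Ensure`, and `Capacity` are hypothetical names, and host `malloc`/`free` stand in for device allocation so the snippet builds without the Ascend toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

// Hypothetical stand-in for a stream handle (an opaque pointer here).
using StreamHandle = void*;

// Keeps one growable scratch buffer per stream. Host `malloc`/`free` are
// placeholders for device allocation calls.
class WorkspacePoolSketch {
 public:
  WorkspacePoolSketch() = default;
  WorkspacePoolSketch(const WorkspacePoolSketch&) = delete;
  WorkspacePoolSketch& operator=(const WorkspacePoolSketch&) = delete;

  ~WorkspacePoolSketch() {
    for (auto& entry : arenas_) {
      std::free(entry.second.ptr);
    }
  }

  // Returns a buffer of at least `bytes` for `stream`. An existing buffer
  // is reused when large enough; otherwise it is replaced (old contents
  // are discarded, which is acceptable for scratch space).
  void* Ensure(StreamHandle stream, std::size_t bytes) {
    Arena& arena = arenas_[stream];
    if (bytes > arena.capacity) {
      std::free(arena.ptr);
      arena.ptr = std::malloc(bytes);
      arena.capacity = (arena.ptr != nullptr) ? bytes : 0;
    }
    // Postcondition: either the allocation failed or the buffer fits.
    assert(arena.ptr == nullptr || arena.capacity >= bytes);
    return arena.ptr;
  }

  // Current capacity for `stream`, or 0 if nothing was allocated yet.
  std::size_t Capacity(StreamHandle stream) const {
    auto it = arenas_.find(stream);
    return it == arenas_.end() ? 0 : it->second.capacity;
  }

 private:
  struct Arena {
    void* ptr = nullptr;
    std::size_t capacity = 0;
  };

  std::unordered_map<StreamHandle, Arena> arenas_;
};
```

Keying by stream means two operators queued on different streams never share scratch memory, while repeated calls on the same stream reuse one allocation.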

@zhangyue207 zhangyue207 force-pushed the feat/ascend-framework branch 2 times, most recently from fb9f42f to 62fb25a Compare April 10, 2026 04:06
@zhangyue207

nvidia

(python3.10) zhangyue@server:~/InfiniOps$ python .ci/run.py --local --test "pip install .[dev]"
platform: nvidia
==> running job: nvidia_gpu

=============
== PyTorch ==
=============

NVIDIA Release 25.12 (build 245654590)
PyTorch Version 2.10.0a0+b4e4ee8
Container image Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copyright (c) 2014-2024 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies    (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU                      (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006      Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015      Google Inc.
Copyright (c) 2015      Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

GOVERNING TERMS: The software and materials are governed by the NVIDIA Software License Agreement
(found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/)
and the Product-Specific Terms for NVIDIA AI Products
(found at https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/).

NOTE: CUDA Forward Compatibility mode ENABLED.
  Using CUDA 13.1 driver version 590.44.01 with kernel driver version 580.105.08.
  See https://docs.nvidia.com/deploy/cuda-compatibility/ for details.

NOTE: Mellanox network driver detected, but NVIDIA peer memory driver not
      detected.  Multi-node communication performance may be reduced.

========== Setup ==========
Processing ./.
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Requirement already satisfied: pytest in /usr/local/lib/python3.12/dist-packages (from InfiniOps==0.1.0) (8.1.1)
Requirement already satisfied: pytest-cov in /usr/local/lib/python3.12/dist-packages (from InfiniOps==0.1.0) (7.1.0)
Requirement already satisfied: pytest-xdist in /usr/local/lib/python3.12/dist-packages (from InfiniOps==0.1.0) (3.8.0)
Requirement already satisfied: ruff in /usr/local/lib/python3.12/dist-packages (from InfiniOps==0.1.0) (0.15.7)
Requirement already satisfied: torch in /usr/local/lib/python3.12/dist-packages (from InfiniOps==0.1.0) (2.10.0a0+b4e4ee81d3.nv25.12)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.12/dist-packages (from InfiniOps==0.1.0) (6.0.3)
Requirement already satisfied: iniconfig in /usr/local/lib/python3.12/dist-packages (from pytest->InfiniOps==0.1.0) (2.3.0)
Requirement already satisfied: packaging in /usr/local/lib/python3.12/dist-packages (from pytest->InfiniOps==0.1.0) (25.0)
Requirement already satisfied: pluggy<2.0,>=1.4 in /usr/local/lib/python3.12/dist-packages (from pytest->InfiniOps==0.1.0) (1.6.0)
Requirement already satisfied: coverage>=7.10.6 in /usr/local/lib/python3.12/dist-packages (from coverage[toml]>=7.10.6->pytest-cov->InfiniOps==0.1.0) (7.13.5)
Requirement already satisfied: execnet>=2.1 in /usr/local/lib/python3.12/dist-packages (from pytest-xdist->InfiniOps==0.1.0) (2.1.2)
Requirement already satisfied: filelock in /usr/local/lib/python3.12/dist-packages (from torch->InfiniOps==0.1.0) (3.20.1)
Requirement already satisfied: typing-extensions>=4.10.0 in /usr/local/lib/python3.12/dist-packages (from torch->InfiniOps==0.1.0) (4.15.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.12/dist-packages (from torch->InfiniOps==0.1.0) (80.9.0)
Requirement already satisfied: sympy>=1.13.3 in /usr/local/lib/python3.12/dist-packages (from torch->InfiniOps==0.1.0) (1.14.0)
Requirement already satisfied: networkx>=2.5.1 in /usr/local/lib/python3.12/dist-packages (from torch->InfiniOps==0.1.0) (3.6.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.12/dist-packages (from torch->InfiniOps==0.1.0) (3.1.6)
Requirement already satisfied: fsspec>=0.8.5 in /usr/local/lib/python3.12/dist-packages (from torch->InfiniOps==0.1.0) (2025.10.0)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.12/dist-packages (from sympy>=1.13.3->torch->InfiniOps==0.1.0) (1.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.12/dist-packages (from jinja2->torch->InfiniOps==0.1.0) (3.0.3)
Building wheels for collected packages: InfiniOps
  Building wheel for InfiniOps (pyproject.toml): started
  Building wheel for InfiniOps (pyproject.toml): still running...
  Building wheel for InfiniOps (pyproject.toml): finished with status 'done'
  Created wheel for InfiniOps: filename=infiniops-0.1.0-cp312-cp312-linux_x86_64.whl size=413715 sha256=8367037ce13c5f1ab0e31525eadbaa09340d965c83754692673ff9ad86e75816
  Stored in directory: /tmp/pip-ephem-wheel-cache-mxypsk0a/wheels/b1/a4/57/72f62aaee401db75e8c1c3aca62878014646bf19d33c76d6bf
Successfully built InfiniOps
Installing collected packages: InfiniOps
Successfully installed InfiniOps-0.1.0

@zhangyue207

metax

zhangyue@test:~/InfiniOps$ python3 .ci/run.py --local --test "pip install .[dev]"
platform: metax
==> running job: metax_gpu
========== Setup ==========
Looking in indexes: http://mirrors.aliyun.com/pypi/simple
Processing /tmp/src
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Requirement already satisfied: pytest in /opt/conda/lib/python3.10/site-packages (from InfiniOps==0.1.0) (8.4.1)
Requirement already satisfied: pytest-cov in /opt/conda/lib/python3.10/site-packages (from InfiniOps==0.1.0) (7.1.0)
Requirement already satisfied: pytest-xdist in /opt/conda/lib/python3.10/site-packages (from InfiniOps==0.1.0) (3.8.0)
Requirement already satisfied: ruff in /opt/conda/lib/python3.10/site-packages (from InfiniOps==0.1.0) (0.15.7)
Requirement already satisfied: torch in /opt/conda/lib/python3.10/site-packages (from InfiniOps==0.1.0) (2.4.0+metax3.2.1.3)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.10/site-packages (from InfiniOps==0.1.0) (6.0.3)
Requirement already satisfied: exceptiongroup>=1 in /opt/conda/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (1.3.0)
Requirement already satisfied: iniconfig>=1 in /opt/conda/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (2.1.0)
Requirement already satisfied: packaging>=20 in /opt/conda/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (25.0)
Requirement already satisfied: pluggy<2,>=1.5 in /opt/conda/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (1.6.0)
Requirement already satisfied: pygments>=2.7.2 in /opt/conda/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (2.19.2)
Requirement already satisfied: tomli>=1 in /opt/conda/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (2.3.0)
Requirement already satisfied: coverage>=7.10.6 in /opt/conda/lib/python3.10/site-packages (from coverage[toml]>=7.10.6->pytest-cov->InfiniOps==0.1.0) (7.11.0)
Requirement already satisfied: execnet>=2.1 in /opt/conda/lib/python3.10/site-packages (from pytest-xdist->InfiniOps==0.1.0) (2.1.2)
Requirement already satisfied: filelock in /opt/conda/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (3.20.0)
Requirement already satisfied: typing-extensions>=4.8.0 in /opt/conda/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (4.15.0)
Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (1.14.0)
Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (3.4.2)
Requirement already satisfied: jinja2 in /opt/conda/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (3.1.6)
Requirement already satisfied: fsspec in /opt/conda/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (2025.5.1)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from jinja2->torch->InfiniOps==0.1.0) (3.0.2)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/lib/python3.10/site-packages (from sympy->torch->InfiniOps==0.1.0) (1.3.0)
Building wheels for collected packages: InfiniOps
  Building wheel for InfiniOps (pyproject.toml): started
  Building wheel for InfiniOps (pyproject.toml): still running...
  Building wheel for InfiniOps (pyproject.toml): finished with status 'done'
  Created wheel for InfiniOps: filename=infiniops-0.1.0-cp310-cp310-linux_x86_64.whl size=775970 sha256=36a3fde2e0ab714bf66aade75b7bcc46f55d151bb6fcf8e26b83af3a22a1c27d
  Stored in directory: /tmp/pip-ephem-wheel-cache-rjx7fndp/wheels/ac/4c/a5/78fe3376fbe0f633e8ad47ec3e677a6762cbf147a5e0195bab
Successfully built InfiniOps
Installing collected packages: InfiniOps
Successfully installed InfiniOps-0.1.0

@zhangyue207

zhangyue207 commented Apr 10, 2026

iluvatar

(python3.10) zhangyue@iluvatar:~/InfiniOps$ python .ci/run.py --local --test "pip install .[dev]"
platform: iluvatar
==> running job: iluvatar_gpu
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
========== Setup ==========
Processing ./.
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Requirement already satisfied: pytest in /usr/local/lib/python3.10/site-packages (from InfiniOps==0.1.0) (9.0.2)
Requirement already satisfied: pytest-cov in /usr/local/lib/python3.10/site-packages (from InfiniOps==0.1.0) (7.0.0)
Requirement already satisfied: pytest-xdist in /usr/local/lib/python3.10/site-packages (from InfiniOps==0.1.0) (3.8.0)
Requirement already satisfied: ruff in /usr/local/lib/python3.10/site-packages (from InfiniOps==0.1.0) (0.15.7)
Requirement already satisfied: torch in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from InfiniOps==0.1.0) (2.4.1+corex.4.3.0.20250624)
Requirement already satisfied: pyyaml in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from InfiniOps==0.1.0) (6.0.2)
Requirement already satisfied: exceptiongroup>=1 in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from pytest->InfiniOps==0.1.0) (1.3.0)
Requirement already satisfied: iniconfig>=1.0.1 in /usr/local/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (2.3.0)
Requirement already satisfied: packaging>=22 in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from pytest->InfiniOps==0.1.0) (25.0)
Requirement already satisfied: pluggy<2,>=1.5 in /usr/local/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (1.6.0)
Requirement already satisfied: pygments>=2.7.2 in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from pytest->InfiniOps==0.1.0) (2.19.2)
Requirement already satisfied: tomli>=1 in /usr/local/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (2.4.0)
Requirement already satisfied: typing-extensions>=4.6.0 in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from exceptiongroup>=1->pytest->InfiniOps==0.1.0) (4.14.0)
Requirement already satisfied: coverage>=7.10.6 in /usr/local/lib/python3.10/site-packages (from coverage[toml]>=7.10.6->pytest-cov->InfiniOps==0.1.0) (7.13.5)
Requirement already satisfied: execnet>=2.1 in /usr/local/lib/python3.10/site-packages (from pytest-xdist->InfiniOps==0.1.0) (2.1.2)
Requirement already satisfied: filelock in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from torch->InfiniOps==0.1.0) (3.18.0)
Requirement already satisfied: sympy in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from torch->InfiniOps==0.1.0) (1.14.0)
Requirement already satisfied: networkx in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from torch->InfiniOps==0.1.0) (3.4.2)
Requirement already satisfied: jinja2 in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from torch->InfiniOps==0.1.0) (3.1.6)
Requirement already satisfied: fsspec in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from torch->InfiniOps==0.1.0) (2025.5.1)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from jinja2->torch->InfiniOps==0.1.0) (3.0.2)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/corex-4.3.0.20250624/lib64/python3/dist-packages (from sympy->torch->InfiniOps==0.1.0) (1.3.0)
Building wheels for collected packages: InfiniOps
  Building wheel for InfiniOps (pyproject.toml): started
  Building wheel for InfiniOps (pyproject.toml): finished with status 'done'
  Created wheel for InfiniOps: filename=infiniops-0.1.0-cp310-cp310-linux_x86_64.whl size=379218 sha256=6a36a0d91c29d2ff0c7ad6f97eaccf50d91c821a43e2607dd2e135aae74fec68
  Stored in directory: /tmp/pip-ephem-wheel-cache-26j0w2jf/wheels/ac/4c/a5/78fe3376fbe0f633e8ad47ec3e677a6762cbf147a5e0195bab
Successfully built InfiniOps
Installing collected packages: InfiniOps
Successfully installed InfiniOps-0.1.0

@zhangyue207

cambricon

[zhangyue@localhost InfiniOps]$ python .ci/run.py --local --test "pip install .[dev]"
platform: cambricon
==> running job: cambricon_gpu
========== Setup ==========
Looking in indexes: http://mirrors.aliyun.com/pypi/simple
Processing /tmp/src
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Requirement already satisfied: ruff in /usr/local/python3.10/lib/python3.10/site-packages (from InfiniOps==0.1.0) (0.15.7)
Requirement already satisfied: pytest-xdist in /usr/local/python3.10/lib/python3.10/site-packages (from InfiniOps==0.1.0) (3.8.0)
Requirement already satisfied: pytest-cov in /usr/local/python3.10/lib/python3.10/site-packages (from InfiniOps==0.1.0) (7.1.0)
Requirement already satisfied: pytest in /usr/local/python3.10/lib/python3.10/site-packages (from InfiniOps==0.1.0) (9.0.2)
Requirement already satisfied: pyyaml in /usr/local/python3.10/lib/python3.10/site-packages (from InfiniOps==0.1.0) (5.3.1)
Requirement already satisfied: torch in /usr/local/python3.10/lib/python3.10/site-packages (from InfiniOps==0.1.0) (2.1.0)
Requirement already satisfied: pluggy<2,>=1.5 in /usr/local/python3.10/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (1.6.0)
Requirement already satisfied: iniconfig>=1.0.1 in /usr/local/python3.10/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (2.3.0)
Requirement already satisfied: tomli>=1 in /usr/local/python3.10/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (2.4.0)
Requirement already satisfied: pygments>=2.7.2 in /usr/local/python3.10/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (2.19.2)
Requirement already satisfied: exceptiongroup>=1 in /usr/local/python3.10/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (1.3.0)
Requirement already satisfied: packaging>=22 in /usr/local/python3.10/lib/python3.10/site-packages (from pytest->InfiniOps==0.1.0) (25.0)
Requirement already satisfied: coverage[toml]>=7.10.6 in /usr/local/python3.10/lib/python3.10/site-packages (from pytest-cov->InfiniOps==0.1.0) (7.13.5)
Requirement already satisfied: execnet>=2.1 in /usr/local/python3.10/lib/python3.10/site-packages (from pytest-xdist->InfiniOps==0.1.0) (2.1.2)
Requirement already satisfied: sympy in /usr/local/python3.10/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (1.14.0)
Requirement already satisfied: networkx in /usr/local/python3.10/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (3.4.2)
Requirement already satisfied: fsspec in /usr/local/python3.10/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (2025.5.1)
Requirement already satisfied: jinja2 in /usr/local/python3.10/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (3.1.6)
Requirement already satisfied: typing-extensions in /usr/local/python3.10/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (4.14.0)
Requirement already satisfied: filelock in /usr/local/python3.10/lib/python3.10/site-packages (from torch->InfiniOps==0.1.0) (3.18.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/python3.10/lib/python3.10/site-packages (from jinja2->torch->InfiniOps==0.1.0) (3.0.2)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/python3.10/lib/python3.10/site-packages (from sympy->torch->InfiniOps==0.1.0) (1.3.0)
Building wheels for collected packages: InfiniOps
  Building wheel for InfiniOps (pyproject.toml): started
  Building wheel for InfiniOps (pyproject.toml): still running...
  Building wheel for InfiniOps (pyproject.toml): finished with status 'done'
  Created wheel for InfiniOps: filename=infiniops-0.1.0-cp310-cp310-linux_aarch64.whl size=196128 sha256=f8058928250eae585c7978caacc0756ad5e798b8e2c2464bbec3f230b1aac2a0
  Stored in directory: /tmp/pip-ephem-wheel-cache-rbe8v3ag/wheels/ac/4c/a5/78fe3376fbe0f633e8ad47ec3e677a6762cbf147a5e0195bab
Successfully built InfiniOps
Installing collected packages: InfiniOps
Successfully installed InfiniOps-0.1.0

@zhangyue207

zhangyue207 commented Apr 10, 2026

moore

zhangyue@mccx:~/InfiniOps$ python3 .ci/run.py --local --test "pip install .[dev]"
platform: moore
==> running job: moore_gpu
========== Setup ==========
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Processing /tmp/src
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Requirement already satisfied: pytest in /usr/local/lib/python3.10/dist-packages (from InfiniOps==0.1.0) (7.2.2)
Requirement already satisfied: pytest-cov in /usr/local/lib/python3.10/dist-packages (from InfiniOps==0.1.0) (7.1.0)
Requirement already satisfied: pytest-xdist in /usr/local/lib/python3.10/dist-packages (from InfiniOps==0.1.0) (3.8.0)
Requirement already satisfied: ruff in /usr/local/lib/python3.10/dist-packages (from InfiniOps==0.1.0) (0.15.7)
Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (from InfiniOps==0.1.0) (2.5.0)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from InfiniOps==0.1.0) (6.0.2)
Requirement already satisfied: attrs>=19.2.0 in /usr/local/lib/python3.10/dist-packages (from pytest->InfiniOps==0.1.0) (25.3.0)
Requirement already satisfied: iniconfig in /usr/local/lib/python3.10/dist-packages (from pytest->InfiniOps==0.1.0) (2.1.0)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from pytest->InfiniOps==0.1.0) (24.2)
Requirement already satisfied: pluggy<2.0,>=0.12 in /usr/local/lib/python3.10/dist-packages (from pytest->InfiniOps==0.1.0) (1.6.0)
Requirement already satisfied: exceptiongroup>=1.0.0rc8 in /usr/local/lib/python3.10/dist-packages (from pytest->InfiniOps==0.1.0) (1.3.0)
Requirement already satisfied: tomli>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from pytest->InfiniOps==0.1.0) (2.2.1)
Requirement already satisfied: typing-extensions>=4.6.0 in /usr/local/lib/python3.10/dist-packages (from exceptiongroup>=1.0.0rc8->pytest->InfiniOps==0.1.0) (4.15.0)
Requirement already satisfied: coverage>=7.10.6 in /usr/local/lib/python3.10/dist-packages (from coverage[toml]>=7.10.6->pytest-cov->InfiniOps==0.1.0) (7.13.5)
Requirement already satisfied: execnet>=2.1 in /usr/local/lib/python3.10/dist-packages (from pytest-xdist->InfiniOps==0.1.0) (2.1.2)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch->InfiniOps==0.1.0) (3.19.1)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch->InfiniOps==0.1.0) (3.4.2)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch->InfiniOps==0.1.0) (3.1.6)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch->InfiniOps==0.1.0) (2025.9.0)
Requirement already satisfied: sympy==1.13.1 in /usr/local/lib/python3.10/dist-packages (from torch->InfiniOps==0.1.0) (1.13.1)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy==1.13.1->torch->InfiniOps==0.1.0) (1.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch->InfiniOps==0.1.0) (3.0.2)
Building wheels for collected packages: InfiniOps
  Building wheel for InfiniOps (pyproject.toml): started
  Building wheel for InfiniOps (pyproject.toml): still running...
  Building wheel for InfiniOps (pyproject.toml): finished with status 'done'
  Created wheel for InfiniOps: filename=infiniops-0.1.0-cp310-cp310-linux_x86_64.whl size=403085 sha256=a972c7b1e46c3bdf81d72263f52734d563a374ed267d66c8efd69676890a0741
  Stored in directory: /tmp/pip-ephem-wheel-cache-49b8ut6h/wheels/ac/4c/a5/78fe3376fbe0f633e8ad47ec3e677a6762cbf147a5e0195bab
Successfully built InfiniOps
Installing collected packages: InfiniOps
Successfully installed InfiniOps-0.1.0

@zhangyue207

ascend

tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape0-b_shape0-c_shape0-None-None-None]
[gw0] [ 99%] SKIPPED tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape0-b_shape0-c_shape0-None-None-None]
tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape1-b_shape1-c_shape1-None-None-None]
[gw0] [ 99%] SKIPPED tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape1-b_shape1-c_shape1-None-None-None]
tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape2-b_shape2-c_shape2-a_strides2-b_strides2-c_strides2]
[gw0] [ 99%] SKIPPED tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape2-b_shape2-c_shape2-a_strides2-b_strides2-c_strides2]
tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape3-b_shape3-c_shape3-a_strides3-b_strides3-c_strides3]
[gw0] [ 99%] SKIPPED tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape3-b_shape3-c_shape3-a_strides3-b_strides3-c_strides3]
tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape4-b_shape4-c_shape4-None-None-None]
[gw0] [100%] SKIPPED tests/test_gemm.py::test_gemm[npu-dtype2-0.01-0.01-1-True-True-1-1-a_shape4-b_shape4-c_shape4-None-None-None]

----------- generated xml file: /workspace/results/test-results.xml ------------
===================== 1500 passed, 1500 skipped in 29.93s ======================
========== Summary ==========

zhangyue and others added 12 commits April 11, 2026 00:36
…ld integration

Add Ascend platform scaffolding:
- `device_.h`: `DeviceEnabled<kAscend>` specialization
- `data_type_.h`: `toAclDtype()`, `isIntegerDtype()`
- `common.h`: `buildAclTensor()` with optional transpose
- `workspace_pool_.h`: stream-keyed workspace allocator
- `runtime_.h`: `Runtime<kAscend>` (Malloc, Free, Memcpy, Memset)
- 5 new operator base classes (AddRmsNorm, FlashAttention, Matmul,
  ReshapeAndCache, RotaryEmbedding)

Integrate into CMake build system, Python binding generation (stream +
optional tensor support), and examples runtime API.

…emove missing include

- Wrap `aclrtMemcpy` (5-arg) and `aclrtMemset` (4-arg) in lambdas to
  match the generic 4-arg / 3-arg calling convention used by examples.
- Assert `aclrtMalloc` return value in `WorkspacePool::ensure()`.
- Remove `ascend/gemm/kernel.h` include from `runtime_api.h` (file
  does not exist until the kernels commit).
- Add Ascend GEMM specialization using `aclnnAddmm`/`aclnnBaddbmm`.
- Add `get_npu_stream()` helper and NPU device detection in test utils.
- Add `skip_unsupported_dtype` fixture for Ascend in conftest.
- Update `runtime_api.h` with Ascend backend entry.
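
The lambda wrapping mentioned above for `aclrtMemcpy` and `aclrtMemset` could look roughly like the following. This is a sketch under assumptions: `FakeMemcpy5` and `FakeMemset4` are host-side stand-ins shaped like a 5-argument copy and a 4-argument fill (they are not the real ACL functions), `CopyKind` is a placeholder enum, and the shorter 4-arg / 3-arg signatures are assumed to follow the familiar `(dst, src, count, kind)` and `(ptr, value, count)` order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Placeholder for a copy-direction enum.
enum class CopyKind { kHostToHost };

// Stand-in with an extra destination-capacity argument (5 arguments).
inline int FakeMemcpy5(void* dst, std::size_t dst_max, const void* src,
                       std::size_t count, CopyKind /*kind*/) {
  if (count > dst_max) return 1;  // Destination too small.
  std::memcpy(dst, src, count);
  return 0;
}

// Stand-in with an extra capacity argument (4 arguments).
inline int FakeMemset4(void* ptr, std::size_t max_count, std::int32_t value,
                       std::size_t count) {
  if (count > max_count) return 1;
  std::memset(ptr, value, count);
  return 0;
}

// Adapters exposing the shorter generic signatures. The capacity argument
// is filled with `count`, which assumes the caller sized the buffer
// correctly.
static const auto Memcpy = [](void* dst, const void* src, std::size_t count,
                              CopyKind kind) {
  return FakeMemcpy5(dst, count, src, count, kind);
};

static const auto Memset = [](void* ptr, int value, std::size_t count) {
  return FakeMemset4(ptr, count, value, count);
};
```

With adapters of this shape, generic example code can call `Memcpy(dst, src, n, kind)` and `Memset(ptr, 0, n)` without knowing that the backend functions take extra size arguments.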

The `aclrtMalloc` call was the sole expression inside `assert()`, so it
was compiled away in release builds (NDEBUG). This left the workspace
buffer null, causing `aclnnAddmm` to return ACLNN_ERR_PARAM_NULLPTR
(161001) for any operation that requires workspace (e.g. alpha != 1.0).
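
The failure mode described above can be reproduced in isolation. In this sketch `RELEASE_MODE_ASSERT` mimics what `assert` expands to when `NDEBUG` is defined, and `FakeMalloc`, `EnsureBuggy`, and `EnsureFixed` are hypothetical names; the actual change in `WorkspacePool::ensure()` may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Mimics `assert(expr)` under NDEBUG: the expression is discarded and
// never evaluated.
#define RELEASE_MODE_ASSERT(expr) ((void)0)

// Hypothetical allocator stand-in; returns 0 on success.
inline int FakeMalloc(void** out, std::size_t bytes) {
  *out = std::malloc(bytes);
  return *out != nullptr ? 0 : 1;
}

// Buggy pattern: the allocation is the asserted expression, so in a
// release build it vanishes and `buffer` stays null.
inline void* EnsureBuggy(std::size_t bytes) {
  void* buffer = nullptr;
  RELEASE_MODE_ASSERT(FakeMalloc(&buffer, bytes) == 0);
  (void)bytes;  // Only referenced inside the discarded expression.
  return buffer;
}

// Safer pattern: always perform the call, then check the saved status.
inline void* EnsureFixed(std::size_t bytes) {
  void* buffer = nullptr;
  const int status = FakeMalloc(&buffer, bytes);
  assert(status == 0 && "allocation failed");
  (void)status;  // Avoids an unused-variable warning under NDEBUG.
  return buffer;
}
```

The general rule is to keep side effects out of `assert` arguments: store the return value first and assert on the stored value.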

`CudaCausalSoftmax` was missing `#include "cuda/runtime_utils.h"`,
causing `RuntimeUtils` to be undefined. Drop `std::forward` from
`Operator::make` nested lambda — NVCC instantiates the body during
SFINAE invocability checks even inside `if constexpr` false branches,
causing template resolution failures. All operator constructors take
parameters by value, so passing lvalues has identical semantics.

Upgrade base image from `nvcr.io/nvidia/pytorch:24.10-py3` (CUDA 12.6)
to `25.12-py3` (CUDA 13.1), aligning CI with the local dev environment.
Restore `std::forward<Args>(args)...` in `Operator::make`, as the NVCC
bug that required dropping it is fixed in the newer toolkit.
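
To illustrate the trade-off between the two variants of `Operator::make` discussed above, here is a generic sketch. `NamedOp`, `MakePlain`, and `MakeForwarded` are hypothetical and are not the project's code; the point is only that, for a constructor taking its parameter by value, both variants build an equivalent object, while the forwarding variant can move from rvalue arguments instead of copying them.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical operator whose constructor takes its parameter by value.
struct NamedOp {
  explicit NamedOp(std::string name) : name_(std::move(name)) {}

  std::string name_;
};

// Without forwarding: `args` are lvalues inside the function, so the
// by-value constructor parameter is copy-constructed.
template <typename Op, typename... Args>
Op MakePlain(Args&&... args) {
  return Op(args...);
}

// With perfect forwarding: rvalue arguments are moved into the parameter.
template <typename Op, typename... Args>
Op MakeForwarded(Args&&... args) {
  return Op(std::forward<Args>(args)...);
}
```

Calling `MakePlain<NamedOp>(std::move(s))` leaves `s` untouched because the argument is copied, whereas `MakeForwarded<NamedOp>(std::move(s))` may leave `s` in a moved-from state.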

`Tensor::Size` (`unsigned long`) to `int64_t` narrowing is an error on
MetaX's clang-based compiler (`-Wc++11-narrowing`).
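
A minimal sketch of the kind of fix this implies, assuming the narrowing occurred in brace initialization: convert each dimension with an explicit `static_cast`. `TensorSize` and `ToSignedShape` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical alias matching the unsigned size type described above.
using TensorSize = unsigned long;

// Brace initialization such as `std::int64_t v{dim};` with a non-constant
// `unsigned long` is a narrowing conversion, which Clang rejects by
// default. An explicit cast states the intent and compiles everywhere.
inline std::vector<std::int64_t> ToSignedShape(
    const std::vector<TensorSize>& shape) {
  std::vector<std::int64_t> out;
  out.reserve(shape.size());
  for (TensorSize dim : shape) {
    // Guard against values that do not fit in a signed 64-bit integer.
    assert(dim <= static_cast<TensorSize>(
                      std::numeric_limits<std::int64_t>::max()));
    out.push_back(static_cast<std::int64_t>(dim));
  }
  return out;
}
```

GCC typically reports the same construct as a warning only, which is one reason such code can pass on other platforms and fail on a Clang-based toolchain.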

- Add blank lines between struct/class members per style guide
- Capitalize comments and use backtick syntax for code refs in `matmul.h`
- Move `import re` to module level in `generate_wrappers.py`
- Add blank lines before `for`/`return` per PEP 8 in `generate_wrappers.py`
- Replace `-k npu` with `--devices ascend` in CI config
- Fix `ruff format` violations in `generate_wrappers.py` and `test_gemm.py`.
- Fix `ruff isort` violation: move `import re` into stdlib group.
- Add backticks around identifiers in comments (`numel()`, `operator()`,
  `make()`, `torch_npu`, `uint16`/`uint32`/`uint64`).
- Add missing blank line after `if` block in `skip_unsupported_dtype`.
- Remove `.worktrees/` from project `.gitignore` (belongs in global gitignore).
@zhangyue207 zhangyue207 force-pushed the feat/ascend-framework branch from 80acc8b to 7628b2f Compare April 10, 2026 16:58