summaryrefslogtreecommitdiff
path: root/tools/perf/scripts/python/export-to-sqlite.py
diff options
context:
space:
mode:
authorJason Gunthorpe <jgg@nvidia.com>2025-11-04 14:30:10 -0400
committerJoerg Roedel <joerg.roedel@amd.com>2025-11-05 09:07:14 +0100
commitaef5de756ea871ab44e3a1a87be6c944e6587c51 (patch)
tree8a380a698abe62ef9cf53bbce120a564a38bf94f /tools/perf/scripts/python/export-to-sqlite.py
parente93d5945ed5bb086431e83eed7ab98b6c058cc0b (diff)
iommupt: Add the x86 64 bit page table format
This is used by x86 CPUs and can be used in AMD/VT-d x86 IOMMUs. When a x86 IOMMU is running SVA the MM will be using this format. This implementation follows the AMD v2 io-pgtable version. There is nothing remarkable here, the format can have 4 or 5 levels and limited support for different page sizes. No contiguous pages support. x86 uses a sign extension mechanism where the top bits of the VA must match the sign bit. The core code supports this through PT_FEAT_SIGN_EXTEND which creates and upper and lower VA range. All the new operations will work correctly in both spaces, however currently there is no way to report the upper space to other layers. Future patches can improve that. In principle this can support 3 page tables levels matching the 32 bit PAE table format, but no iommu driver needs this. The focus is on the modern 64 bit 4 and 5 level formats. Comparing the performance of several operations to the existing version: iommu_map() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 71,61 , 66,58 , -13.13 2^21, 66,60 , 61,55 , -10.10 2^30, 59,56 , 56,54 , -3.03 256*2^12, 392,1360 , 345,1289 , 73.73 256*2^21, 383,1159 , 335,1145 , 70.70 256*2^30, 378,965 , 331,892 , 62.62 iommu_unmap() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 77,71 , 73,68 , -7.07 2^21, 76,70 , 70,66 , -6.06 2^30, 69,66 , 66,63 , -4.04 256*2^12, 225,899 , 210,870 , 75.75 256*2^21, 262,722 , 248,710 , 65.65 256*2^30, 251,643 , 244,634 , 61.61 The small -ve values in the iommu_unmap() are due to the core code calling iommu_pgsize() before invoking the domain op. This is unncessary with this implementation. Future work optimizes this and gets to 2%, 4%, 3%. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Diffstat (limited to 'tools/perf/scripts/python/export-to-sqlite.py')
0 files changed, 0 insertions, 0 deletions