diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-11-01 06:28:35 -1000 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-11-01 06:28:35 -1000 |
| commit | 7d461b291e65938f15f56fe58da2303b07578a76 (patch) | |
| tree | 015dd7c2f1743dd70be52787dd9aff33822bc938 /drivers/gpu/drm/amd/amdkfd/kfd_topology.c | |
| parent | 8bc9e6515183935fa0cccaf67455c439afe4982b (diff) | |
| parent | 631808095a82e6b6f8410a95f8b12b8d0d38b161 (diff) | |
Merge tag 'drm-next-2023-10-31-1' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Highlights:
- AMD adds some more upcoming HW platforms
- Intel made Meteorlake stable and started adding Lunarlake
- nouveau has a bunch of display rework in prepartion for the NVIDIA
GSP firmware support
- msm adds a7xx support
- habanalabs has finished migration to accel subsystem
Detail summary:
kernel:
- add initial vmemdup-user-array
core:
- fix platform remove() to return void
- drm_file owner updated to reflect owner
- move size calcs to drm buddy allocator
- let GPUVM build as a module
- allow variable number of run-queues in scheduler
edid:
- handle bad h/v sync_end in EDIDs
panfrost:
- add Boris as maintainer
fbdev:
- use fb_ops helpers more
- only allow logo use from fbcon
- rename fb_pgproto to pgprot_framebuffer
- add HPD state to drm_connector_oob_hotplug_event
- convert to fbdev i/o mem helpers
i915:
- Enable meteorlake by default
- Early Xe2 LPD/Lunarlake display enablement
- Rework subplatforms into IP version checks
- GuC based TLB invalidation for Meteorlake
- Display rework for future Xe driver integration
- LNL FBC features
- LNL display feature capability reads
- update recommended fw versions for DG2+
- drop fastboot module parameter
- added deviceid for Arrowlake-S
- drop preproduction workarounds
- don't disable preemption for resets
- cleanup inlines in headers
- PXP firmware loading fix
- Fix sg list lengths
- DSC PPS state readout/verification
- Add more RPL P/U PCI IDs
- Add new DG2-G12 stepping
- DP enhanced framing support to state checker
- Improve shared link bandwidth management
- stop using GEM macros in display code
- refactor related code into display code
- locally enable W=1 warnings
- remove PSR watchdog timers on LNL
amdgpu:
- RAS/FRU EEPROM updatse
- IP discovery updatses
- GC 11.5 support
- DCN 3.5 support
- VPE 6.1 support
- NBIO 7.11 support
- DML2 support
- lots of IP updates
- use flexible arrays for bo list handling
- W=1 fixes
- Enable seamless boot in more cases
- Enable context type property for HDMI
- Rework GPUVM TLB flushing
- VCN IB start/size alignment fixes
amdkfd:
- GC 10/11 fixes
- GC 11.5 support
- use partial migration in GPU faults
radeon:
- W=1 Fixes
- fix some possible buffer overflow/NULL derefs
nouveau:
- update uapi for NO_PREFETCH
- scheduler/fence fixes
- rework suspend/resume for GSP-RM
- rework display in preparation for GSP-RM
habanalabs:
- uapi: expose tsc clock
- uapi: block access to eventfd through control device
- uapi: force dma-buf export to PAGE_SIZE alignments
- complete move to accel subsystem
- move firmware interface include files
- perform hard reset on PCIe AXI drain event
- optimise user interrupt handling
msm:
- DP: use existing helpers for DPCD
- DPU: interrupts reworked
- gpu: a7xx (a730/a740) support
- decouple msm_drv from kms for headless devices
mediatek:
- MT8188 dsi/dp/edp support
- DDP GAMMA - 12 bit LUT support
- connector dynamic selection capability
rockchip:
- rv1126 mipi-dsi/vop support
- add planar formats
ast:
- rename constants
panels:
- Mitsubishi AA084XE01
- JDI LPM102A188A
- LTK050H3148W-CTA6
ivpu:
- power management fixes
qaic:
- add detach slice bo api
komeda:
- add NV12 writeback
tegra:
- support NVSYNC/NHSYNC
- host1x suspend fixes
ili9882t:
- separate into own driver"
* tag 'drm-next-2023-10-31-1' of git://anongit.freedesktop.org/drm/drm: (1803 commits)
drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo
drm/amdgpu: Remove duplicate fdinfo fields
drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded
drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems
drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table
drm/amdgpu: Identify data parity error corrected in replay mode
drm/amdgpu: Fix typo in IP discovery parsing
drm/amd/display: fix S/G display enablement
drm/amdxcp: fix amdxcp unloads incompletely
drm/amd/amdgpu: fix the GPU power print error in pm info
drm/amdgpu: Use pcie domain of xcc acpi objects
drm/amd: check num of link levels when update pcie param
drm/amdgpu: Add a read to GFX v9.4.3 ring test
drm/amd/pm: call smu_cmn_get_smc_version in is_mode1_reset_supported.
drm/amdgpu: get RAS poison status from DF v4_6_2
drm/amdgpu: Use discovery table's subrevision
drm/amd/display: 3.2.256
drm/amd/display: add interface to query SubVP status
drm/amd/display: Read before writing Backlight Mode Set Register
drm/amd/display: Disable SYMCLK32_SE RCO on DCN314
...
Diffstat (limited to 'drivers/gpu/drm/amd/amdkfd/kfd_topology.c')
| -rw-r--r-- | drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 44 |
1 files changed, 20 insertions, 24 deletions
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index 490000992ecd..c1e10f42db28 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c @@ -1533,7 +1533,6 @@ out: /* Helper function. See kfd_fill_gpu_cache_info for parameter description */ static int fill_in_l1_pcache(struct kfd_cache_properties **props_ext, struct kfd_gpu_cache_info *pcache_info, - struct kfd_cu_info *cu_info, int cu_bitmask, int cache_type, unsigned int cu_processor_id, int cu_block) @@ -1595,7 +1594,8 @@ static int fill_in_l1_pcache(struct kfd_cache_properties **props_ext, /* Helper function. See kfd_fill_gpu_cache_info for parameter description */ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext, struct kfd_gpu_cache_info *pcache_info, - struct kfd_cu_info *cu_info, + struct amdgpu_cu_info *cu_info, + struct amdgpu_gfx_config *gfx_info, int cache_type, unsigned int cu_processor_id, struct kfd_node *knode) { @@ -1606,7 +1606,7 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext, start = ffs(knode->xcc_mask) - 1; end = start + NUM_XCC(knode->xcc_mask); - cu_sibling_map_mask = cu_info->cu_bitmap[start][0][0]; + cu_sibling_map_mask = cu_info->bitmap[start][0][0]; cu_sibling_map_mask &= ((1 << pcache_info[cache_type].num_cu_shared) - 1); first_active_cu = ffs(cu_sibling_map_mask); @@ -1642,15 +1642,15 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext, k = 0; for (xcc = start; xcc < end; xcc++) { - for (i = 0; i < cu_info->num_shader_engines; i++) { - for (j = 0; j < cu_info->num_shader_arrays_per_engine; j++) { + for (i = 0; i < gfx_info->max_shader_engines; i++) { + for (j = 0; j < gfx_info->max_sh_per_se; j++) { pcache->sibling_map[k] = (uint8_t)(cu_sibling_map_mask & 0xFF); pcache->sibling_map[k+1] = (uint8_t)((cu_sibling_map_mask >> 8) & 0xFF); pcache->sibling_map[k+2] = (uint8_t)((cu_sibling_map_mask >> 16) & 0xFF); pcache->sibling_map[k+3] = (uint8_t)((cu_sibling_map_mask >> 24) & 0xFF); k += 4; - cu_sibling_map_mask = cu_info->cu_bitmap[xcc][i % 4][j + i / 4]; + cu_sibling_map_mask = cu_info->bitmap[xcc][i % 4][j + i / 4]; cu_sibling_map_mask &= ((1 << pcache_info[cache_type].num_cu_shared) - 1); } } @@ -1675,16 +1675,14 @@ static void kfd_fill_cache_non_crat_info(struct kfd_topology_device *dev, struct unsigned int cu_processor_id; int ret; unsigned int num_cu_shared; - struct kfd_cu_info cu_info; - struct kfd_cu_info *pcu_info; + struct amdgpu_cu_info *cu_info = &kdev->adev->gfx.cu_info; + struct amdgpu_gfx_config *gfx_info = &kdev->adev->gfx.config; int gpu_processor_id; struct kfd_cache_properties *props_ext; int num_of_entries = 0; int num_of_cache_types = 0; struct kfd_gpu_cache_info cache_info[KFD_MAX_CACHE_TYPES]; - amdgpu_amdkfd_get_cu_info(kdev->adev, &cu_info); - pcu_info = &cu_info; gpu_processor_id = dev->node_props.simd_id_base; @@ -1711,12 +1709,12 @@ static void kfd_fill_cache_non_crat_info(struct kfd_topology_device *dev, struct cu_processor_id = gpu_processor_id; if (pcache_info[ct].cache_level == 1) { for (xcc = start; xcc < end; xcc++) { - for (i = 0; i < pcu_info->num_shader_engines; i++) { - for (j = 0; j < pcu_info->num_shader_arrays_per_engine; j++) { - for (k = 0; k < pcu_info->num_cu_per_sh; k += pcache_info[ct].num_cu_shared) { + for (i = 0; i < gfx_info->max_shader_engines; i++) { + for (j = 0; j < gfx_info->max_sh_per_se; j++) { + for (k = 0; k < gfx_info->max_cu_per_sh; k += pcache_info[ct].num_cu_shared) { - ret = fill_in_l1_pcache(&props_ext, pcache_info, pcu_info, - pcu_info->cu_bitmap[xcc][i % 4][j + i / 4], ct, + ret = fill_in_l1_pcache(&props_ext, pcache_info, + cu_info->bitmap[xcc][i % 4][j + i / 4], ct, cu_processor_id, k); if (ret < 0) @@ -1729,9 +1727,9 @@ static void kfd_fill_cache_non_crat_info(struct kfd_topology_device *dev, struct /* Move to next CU block */ num_cu_shared = ((k + pcache_info[ct].num_cu_shared) <= - pcu_info->num_cu_per_sh) ? + gfx_info->max_cu_per_sh) ? pcache_info[ct].num_cu_shared : - (pcu_info->num_cu_per_sh - k); + (gfx_info->max_cu_per_sh - k); cu_processor_id += num_cu_shared; } } @@ -1739,7 +1737,7 @@ static void kfd_fill_cache_non_crat_info(struct kfd_topology_device *dev, struct } } else { ret = fill_in_l2_l3_pcache(&props_ext, pcache_info, - pcu_info, ct, cu_processor_id, kdev); + cu_info, gfx_info, ct, cu_processor_id, kdev); if (ret < 0) break; @@ -1918,10 +1916,11 @@ int kfd_topology_add_device(struct kfd_node *gpu) { uint32_t gpu_id; struct kfd_topology_device *dev; - struct kfd_cu_info cu_info; int res = 0; int i; const char *asic_name = amdgpu_asic_name[gpu->adev->asic_type]; + struct amdgpu_gfx_config *gfx_info = &gpu->adev->gfx.config; + struct amdgpu_cu_info *cu_info = &gpu->adev->gfx.cu_info; gpu_id = kfd_generate_gpu_id(gpu); if (gpu->xcp && !gpu->xcp->ddev) { @@ -1959,9 +1958,6 @@ int kfd_topology_add_device(struct kfd_node *gpu) /* Fill-in additional information that is not available in CRAT but * needed for the topology */ - - amdgpu_amdkfd_get_cu_info(dev->gpu->adev, &cu_info); - for (i = 0; i < KFD_TOPOLOGY_PUBLIC_NAME_SIZE-1; i++) { dev->node_props.name[i] = __tolower(asic_name[i]); if (asic_name[i] == '\0') @@ -1970,7 +1966,7 @@ int kfd_topology_add_device(struct kfd_node *gpu) dev->node_props.name[i] = '\0'; dev->node_props.simd_arrays_per_engine = - cu_info.num_shader_arrays_per_engine; + gfx_info->max_sh_per_se; dev->node_props.gfx_target_version = gpu->kfd->device_info.gfx_target_version; @@ -2051,7 +2047,7 @@ int kfd_topology_add_device(struct kfd_node *gpu) */ if (dev->gpu->adev->asic_type == CHIP_CARRIZO) { dev->node_props.simd_count = - cu_info.simd_per_cu * cu_info.cu_active_number; + cu_info->simd_per_cu * cu_info->number; dev->node_props.max_waves_per_simd = 10; } |
