Tuesday, February 9, 2010

PIO device with cache coherency

USB mass storage and ARM cache coherency
http://lkml.org/lkml/2010/1/29/151


(1st issue: I/D cache coherency)

I've been trying for some time to use a rootfs (ext2) on a USB memory
stick on ARM platforms but without any success. The USB HCD driver is
ISP1760 which doesn't use DMA.

ARM has a Harvard cache architecture and what I get is incoherency
between the I and D caches. The CPU I'm using (ARM11MPCore) has PIPT
caches with D-cache lines allocation on write.

Basically, when user space tries to execute from a new page, it faults
and the data is requested via the VFS layer, SCSI block device and USB
mass storage from the ISP1760 driver. The page is then mapped into user
space and update_mmu_cache() called.

However, since the driver is PIO, the data copied from the USB device
into RAM gets stuck in the D-cache. On the above page requesting path
there is no call to flush_dcache_page() to handle D-cache maintenance
(for DMA drivers, that's handled by the DMA API).

Since the USB mass storage code has the information about the USB driver
capabilities (DMA or PIO), it looks like the best place to call
flush_dcache_page(). But I got lost in the SCSI emulation and all my
attempts failed to get a working rootfs.

Adding flush_dcache_page() higher up in mpage_end_io_read() solves the
problem but that's not the correct fix as it has wider implications and
it's not needed for DMA-capable devices.

(.................)

isp1760: Flush the D-cache for the pipe-in transfer buffers

From: Catalin Marinas <catalin.marinas@arm.com>

When the HCD driver writes the data to the transfer buffers it pollutes
the D-cache (unlike DMA drivers where the device writes the data). If
the corresponding pages get mapped into user space, there are no
additional cache flushing operations performed and this causes random
user space faults on architectures with separate I and D caches
(Harvard) or those with aliasing D-cache.

(.................)

The PIO-MMC drivers walk through a scatter list via sg_miter_start() and
friends. Those helpers take care of this automatically.
(Actually I just ran into an issue that seems related to it. PIO SDHC

(.................)

My issue is with both I-D coherency and D-cache aliasing caused by
pages mapped in both user and kernel space (with different colours). The
flush_dcache_page() call should target both cases.
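
To make "different colours" concrete, here is a minimal user-space sketch in C.
It assumes a virtually indexed cache, 4 KiB pages and a hypothetical 16 KiB
cache way; page_colour() and mappings_may_alias() are illustrative names, not
kernel functions. Two virtual mappings of the same physical page can end up in
different cache sets when their colours differ.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                        /* assumption: 4 KiB pages */
#define WAY_SIZE   (16u * 1024u)              /* assumption: 16 KiB per cache way */
#define NR_COLOURS (WAY_SIZE >> PAGE_SHIFT)   /* 4 colours with these numbers */

/* The colour is formed by the cache index bits that lie above the page offset. */
static unsigned int page_colour(uintptr_t vaddr)
{
    return (unsigned int)((vaddr >> PAGE_SHIFT) & (NR_COLOURS - 1u));
}

/*
 * Two mappings of one physical page can alias in a virtually indexed
 * cache only when their colours differ.
 */
static int mappings_may_alias(uintptr_t kernel_vaddr, uintptr_t user_vaddr)
{
    return page_colour(kernel_vaddr) != page_colour(user_vaddr);
}
```

With these assumed numbers, a kernel mapping at 0xc0001000 (colour 1) and a
user mapping at 0x00402000 (colour 2) may alias, while a user mapping at
0x00405000 (colour 1) would not.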

(.................)

We could of course flush the caches every time we get a page fault but
that's far from optimal, especially since DMA-capable drivers do not
pollute the D-cache and don't need this extra flushing. Note that the
recent ARM processors have PIPT caches but separate for I and D and it's
the PIO drivers that pollute the D-cache.

The kernel API provides flush_dcache_page() to be called every time the
kernel writes to a page cache page. This is further optimised for
working in pair with update_mmu_cache() to delay the flushing until the
actual page is mapped into user space and this latter function is called
(which in general is not a cache maintenance function).

The problem with some PIO drivers and filesystems like ext2 is that
there is no call to flush_dcache_page() when getting data into a page
cache page. Since the page isn't marked as dirty (PG_arch_1), a
subsequent call to update_mmu_cache() as a result of a page fault
doesn't flush the caches.
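
The deferred flushing described above can be sketched as a toy model in plain
user-space C. This is a simplification under stated assumptions, not the
kernel's implementation: struct toy_page and every toy_* function are
hypothetical stand-ins, one int stands for the page contents, and the
arch_dirty flag plays the role of PG_arch_1.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page: 'ram' is what an instruction fetch sees, 'dcache' holds CPU writes. */
struct toy_page {
    int  ram;
    int  dcache;
    bool dcache_valid;    /* D-cache holds data newer than RAM */
    bool arch_dirty;      /* stands in for PG_arch_1 */
    bool mapped_to_user;
};

/* Write the D-cache contents back to RAM and clear the deferred-flush mark. */
static void toy_writeback(struct toy_page *p)
{
    if (p->dcache_valid) {
        p->ram = p->dcache;
        p->dcache_valid = false;
    }
    p->arch_dirty = false;
}

/* A PIO copy is a CPU write: with write-allocate it stays in the D-cache. */
static void toy_pio_write(struct toy_page *p, int value)
{
    p->dcache = value;
    p->dcache_valid = true;
}

/* Model of flush_dcache_page(): defer the flush if no user mapping exists yet. */
static void toy_flush_dcache_page(struct toy_page *p)
{
    if (p->mapped_to_user)
        toy_writeback(p);
    else
        p->arch_dirty = true;
}

/* Model of update_mmu_cache(): flush only if the page was marked earlier. */
static void toy_update_mmu_cache(struct toy_page *p)
{
    p->mapped_to_user = true;
    if (p->arch_dirty)
        toy_writeback(p);
}

/* Instruction fetch reads RAM: the I-cache does not see D-cache contents. */
static int toy_ifetch(const struct toy_page *p)
{
    return p->ram;
}
```

In this model, toy_pio_write() followed directly by toy_update_mmu_cache()
leaves toy_ifetch() returning the stale RAM value, which is the failure
described above. Calling toy_flush_dcache_page() after the PIO write marks the
page, so the later toy_update_mmu_cache() performs the write-back.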

(.................)
(2nd issue: unnecessary DMA cache operations for PIO cause corruption, only on ARMv7 with speculative prefetch)

> This seems wrong to me. Buffers for control transfers may be transferred
> by DMA, so the caches must be flushed on architectures whose caches
> are not coherent with respect to DMA.
Indeed, and that's what I mentioned in the comment. But we shouldn't have DMA
cache maintenance operations done for buffers that use PIO-based transfers.
> Would you care to elaborate on the exact nature of the bug you are fixing?
On the OMAP4 (ARM Cortex-A9) platform, the enumeration fails because control
transfer buffers are corrupted. On our platform, we use PIO mode for control
transfers and DMA for bulk transfers.

The current stack performs DMA cache maintenance even for the PIO transfers,
which leads to the corruption issue. The control buffers are handled by the CPU
and they are already coherent from the CPU's point of view.

(.................)


On map, buffers are cleaned if they're being used for DMA_TO_DEVICE and
DMA_BIDIRECTIONAL, or invalidated in the case of DMA_FROM_DEVICE.

However, because ARM CPUs can now speculatively prefetch, just leaving it
at that results in corruption of buffers used for DMA. So we have to
invalidate DMA_FROM_DEVICE and DMA_BIDIRECTIONAL buffers on unmap to
ensure coherency with DMA operations.

If the CPU writes to a DMA_FROM_DEVICE buffer between map and unmap, the
writes can sit in the cache, and on unmap, they will be discarded.

Cleaning the cache on unmap is not an option; that too can lead to DMA
buffer corruption in the DMA case.
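
The map/unmap rules above can be sketched as a toy model in plain user-space C.
It is only an illustration under stated assumptions: struct toy_buf, enum
toy_dma_dir and the toy_* functions are hypothetical, a single int stands for
the buffer, and a CPU read that misses stands in for a speculative prefetch.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_dma_dir { TOY_BIDIRECTIONAL, TOY_TO_DEVICE, TOY_FROM_DEVICE };

struct toy_buf {
    int  ram;           /* what the device reads and writes */
    int  cache;         /* what the CPU reads and writes */
    bool cache_valid;
    bool cache_dirty;
};

/* Clean: write dirty cache contents back to RAM. */
static void toy_clean(struct toy_buf *b)
{
    if (b->cache_valid && b->cache_dirty) {
        b->ram = b->cache;
        b->cache_dirty = false;
    }
}

/* Invalidate: drop the cached copy, including any unwritten CPU data. */
static void toy_invalidate(struct toy_buf *b)
{
    b->cache_valid = false;
    b->cache_dirty = false;
}

/* On map: clean for TO_DEVICE and BIDIRECTIONAL, invalidate for FROM_DEVICE. */
static void toy_dma_map(struct toy_buf *b, enum toy_dma_dir dir)
{
    if (dir == TOY_FROM_DEVICE)
        toy_invalidate(b);
    else
        toy_clean(b);
}

/* On unmap: invalidate FROM_DEVICE and BIDIRECTIONAL buffers. */
static void toy_dma_unmap(struct toy_buf *b, enum toy_dma_dir dir)
{
    if (dir != TOY_TO_DEVICE)
        toy_invalidate(b);
}

static void toy_cpu_write(struct toy_buf *b, int value)
{
    b->cache = value;
    b->cache_valid = true;
    b->cache_dirty = true;
}

/* A read that misses fills the cache from RAM (also models a prefetch). */
static int toy_cpu_read(struct toy_buf *b)
{
    if (!b->cache_valid) {
        b->cache = b->ram;
        b->cache_valid = true;
        b->cache_dirty = false;
    }
    return b->cache;
}

static void toy_device_write(struct toy_buf *b, int value)
{
    b->ram = value;
}
```

In this model, a prefetch between map and unmap is harmless for a real DMA
transfer because the unmap invalidate drops the stale line. If the CPU itself
fills a TOY_FROM_DEVICE buffer between map and unmap, as a PIO transfer does,
the same invalidate discards the data.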

USB and the associated host drivers must abide by the DMA API buffer
ownership rules otherwise the result will be data corruption; either
that or USB/host driver people need to have a discussion with the
DMA API authors to remove this sensible "restriction".






[PATCH] isp1760: Flush the D-cache for the pipe-in transfer buffers
http://lkml.org/lkml/2010/2/2/142


[RFC PATCH 0/4] PIO drivers and cache coherency
http://www.spinics.net/lists/linux-arch/msg09295.html

[RFC PATCH 1/4] pio-mapping: Add generic support for PIO mapping API
http://www.spinics.net/lists/linux-arch/msg09296.html

[RFC PATCH 2/4] pio-mapping: Add ARM support for the PIO mapping API
http://www.spinics.net/lists/linux-arch/msg09297.html

[RFC PATCH 3/4] pio-mapping: Use the PIO mapping API in libata-sff.c
http://www.spinics.net/lists/linux-arch/msg09298.html

[RFC PATCH 4/4] pio-mapping: Use the PIO mapping API in the ISP1760 HCD driver
http://www.spinics.net/lists/linux-arch/msg09299.html


swiotlb

Kernel development
The current 2.6 kernel is 2.6.7;...
http://lwn.net/Articles/89961/

DMA issues, part 2
[Posted June 30, 2004 by corbet]
http://lwn.net/Articles/91870/


PG_arch_1
http://www.takatan.net/lxr/source/include/asm-arm/cacheflush.h#L97
