Thursday, December 24, 2009

Some issues while running Linux SMP on ARM11MPCore

http://lists.infradead.org/pipermail/linux-arm-kernel/2009-December/006650.html

I'm using an ARM11 MPCore with 2 CPUs, Linux-2.6.31.1, SMP enabled, L1 enabled, L2 disabled.

In the SMP environment, I have observed the following issues:

  1. Sometimes the console becomes extremely slow, printing 1 character every 1-2 seconds.
    RVDS says that both CPUs are idling. The kernel seems fine, because the messages printed in response to inserting a USB flash drive are quick and correct.
    Fixed.

  2. Sometimes the Linux console halts and cannot accept any input.
    RVDS says that both CPUs are idling. The kernel seems fine, because the messages printed in response to inserting a USB flash drive are quick and correct.
    Should be fixed together with case 1.

  3. Sometimes the test stops for no apparent reason, or with a fault such as a segmentation fault, and returns to the console prompt or login prompt.

  4. Sometimes the test stops for no apparent reason without returning to the console prompt. The console accepts input, but gives no further response and no prompt.

    RVDS says that one CPU is idling and the other is in IRQ context, at entry-armv.S(676) after __pabt_usr; it seems to keep getting prefetch aborts.



I can reproduce case 1 by repeatedly inserting a simple test module:
#include <linux/init.h>
#include <linux/module.h>

/* Called on modprobe: just log the function name. */
static int __init MYDRIVER_init(void)
{
	printk("%s: \n", __func__);
	return 0;
}

/* Called on rmmod: just log the function name. */
static void __exit MYDRIVER_exit(void)
{
	printk("%s: \n", __func__);
}

MODULE_AUTHOR("Mac Lin");
MODULE_DESCRIPTION("MYDRIVER");
MODULE_LICENSE("GPL");

module_init(MYDRIVER_init);
module_exit(MYDRIVER_exit);
Insert and remove the module repeatedly, like below:
module=mydriver;modprobe ${module};rmmod ${module}; (...repeated many times; I use 30 repetitions...) modprobe ${module};rmmod ${module};
Issue this whole command line about 10 times in a row, without waiting for the previous command to complete.
At some point I then get case 1.

The following command does not trigger it; it just keeps running:
module=mydriver;while : ; do modprobe ${module};rmmod ${module};done;


After some tracing, I suspected that CONFIG_LOCAL_TIMERS behaves strangely. I disabled it, and the situation changed: case 1 is harder to hit, but there are still issues. For example, it crashes as follows, which is case 3:
[ 57.090000] MYDRIVER_exit:
[ 57.110000] MYDRIVER_init:
[ 57.150000] MYDRIVER_exit:
[ 57.180000] MYDRIVER_init:
[ 57.210000] MYDRIVER_exit:
[ 57.240000] MYDRIVER_init:
[ 57.270000] MYDRIVER_exit:
[ 57.300000] MYDRIVER_init:
[ 57.320000] sh: unhandled page fault (11) at 0x000b7dfc, code 0x017
[ 57.320000] pgd = c78b4000
[ 57.330000] [000b7dfc] *pgd=038f4031, *pte=00000000, *ppte=00000000
[ 57.350000]
[ 57.360000] Pid: 350, comm: sh
[ 57.370000] CPU: 1 Not tainted (2.6.31.1-XXXX1 #53)
[ 57.390000] PC is at 0x40058d04
[ 57.400000] LR is at 0xb7df8
[ 57.400000] pc : [<40058d04>] lr : [<000b7df8>] psr: 60000010
[ 57.400000] sp : bec8b6b8 ip : 0001d020 fp : 00000000
[ 57.440000] r10: 00000000 r9 : bec8b728 r8 : 00000002
[ 57.460000] r7 : 0009c038 r6 : 0001d028 r5 : 4009fe40 r4 : 400a02f8
[ 57.470000] r3 : 00000049 r2 : 0009add8 r1 : 0009add8 r0 : 00000049
[ 57.490000] Flags: nZCv IRQs on FIQs on Mode USER_32 ISA ARM Segment user
[ 57.520000] Control: 00c5787d Table: 078b400a DAC: 00000015
Segmentation fault


Without DCache and CONFIG_LOCAL_TIMERS, I can repeat the above procedure for 216 seconds before it halts as in case 4.

Case 1 also still occurs.

This means that disabling DCache and CONFIG_LOCAL_TIMERS does not avoid these issues; it only mitigates them a little.

BTW, I have done a quick port to linux-2.6.33-rc1, branch master, based on commit f2d9a06. With DCache and CONFIG_LOCAL_TIMERS enabled, I have seen case 1, which means this issue is not fixed there yet.

Without SMP, I haven't seen any of these issues yet.

So currently all the clues point to SMP.

http://lists.infradead.org/pipermail/linux-arm-kernel/2010-January/006901.html

(......................)
(case 1 and 2) These two sound like a problem with interrupts - userspace console IO is interrupt driven, whereas kernel console IO is not.
(......................)
It could be something to do with write allocate caches - we don't support these particularly well in the kernel, and I wouldn't be surprised if you've found some problem there.

The fact that it only happens in SMP mode rather points at that, because that's one of the few hardware configurations which does have write allocate caches. To confirm this, we need someone who can run your tests on a UP platform which has write allocate caches...


http://lists.infradead.org/pipermail/linux-arm-kernel/2010-January/006945.html
http://lists.infradead.org/pipermail/linux-arm-kernel/2010-January/006955.html
Neither a non-SMP kernel nor an SMP kernel booted with maxcpus=1 shows this behavior.


Fix for case 1 and case 2
http://lists.infradead.org/pipermail/linux-arm-kernel/2010-January/007052.html
Thanks to Russell's advice, after some tracing I found that the IER (Interrupt Enable Register) of my serial port is 0 in case 1!

Case 2 is actually the same as case 1. Case 1 comes first; if I stop typing and let it finish its slow printing, it then becomes case 2.

UART_BUG_THRE is detected and enabled on my platform, which causes serial8250_backup_timeout to be used.
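As a rough illustration of what the backup timer does, here is a small userspace sketch of the check that appears in the patch context further down: if the UART reports no pending interrupt although the transmitter is empty and data is waiting, the timer fakes a THRI event so the port keeps running. The function name backup_fixup_iir and the tx_pending parameter are hypothetical, and the bit values are written out from memory of include/linux/serial_reg.h for this kernel generation, so treat them as assumptions.

```c
/* Register bits, assumed to match include/linux/serial_reg.h (2.6.31 era). */
#define UART_IIR_NO_INT 0x01 /* no interrupt pending */
#define UART_IIR_ID     0x06 /* mask for the interrupt ID */
#define UART_IIR_THRI   0x02 /* transmitter holding register empty */
#define UART_IER_THRI   0x02 /* THRE interrupt enabled */
#define UART_LSR_THRE   0x20 /* transmit-hold-register empty */

/*
 * Hypothetical helper: returns the IIR value the backup timer would act on.
 * 'ier' stands for the driver's cached up->ier, 'tx_pending' is non-zero
 * when the transmit buffer (or x_char) still holds data.
 */
unsigned int backup_fixup_iir(unsigned int iir, unsigned int ier,
                              unsigned int lsr, int tx_pending)
{
    if ((iir & UART_IIR_NO_INT) && (ier & UART_IER_THRI) &&
        tx_pending && (lsr & UART_LSR_THRE)) {
        /* The UART "forgot" to raise THRE: pretend it did. */
        iir &= ~(UART_IIR_ID | UART_IIR_NO_INT);
        iir |= UART_IIR_THRI;
    }
    return iir;
}
```

With these assumed values, backup_fixup_iir(0x01, 0x02, 0x20, 1) yields 0x02 (a synthesized THRI), while the same call with tx_pending set to 0 leaves the IIR at 0x01.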

Many places perform the sequence (get IER, clear IER, restore IER), such as serial8250_console_write, which is called by printk, and serial8250_backup_timeout. In serial8250_backup_timeout this sequence is not protected by the spinlock, which causes a race condition and results in a wrong IER value.
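To make the race concrete, here is a minimal single-threaded simulation of one bad interleaving. It is only a sketch: struct fake_uart, section_enter, section_leave and the two run_* functions are hypothetical names, the initial IER value is arbitrary, and the interleaving is replayed by hand rather than produced by real CPUs.

```c
/* Simulated UART: only the Interrupt Enable Register matters here. */
struct fake_uart {
    unsigned int ier;
};

/* Saved state of one "get IER, clear IER, ..., restore IER" sequence. */
struct ier_section {
    unsigned int saved;
};

static void section_enter(struct fake_uart *u, struct ier_section *s)
{
    s->saved = u->ier; /* get IER */
    u->ier = 0;        /* clear IER */
}

static void section_leave(struct fake_uart *u, struct ier_section *s)
{
    u->ier = s->saved; /* restore IER */
}

/* Unprotected: the two sequences overlap, as they can on two CPUs. */
unsigned int run_interleaved(unsigned int initial_ier)
{
    struct fake_uart u = { initial_ier };
    struct ier_section timer, console;

    section_enter(&u, &timer);   /* CPU0: timer saves the real IER, writes 0 */
    section_enter(&u, &console); /* CPU1: console write saves 0 */
    section_leave(&u, &timer);   /* CPU0: restores the real IER */
    section_leave(&u, &console); /* CPU1: restores 0, interrupts stay off */
    return u.ier;
}

/* Protected: a lock forces one sequence to finish before the next starts. */
unsigned int run_serialized(unsigned int initial_ier)
{
    struct fake_uart u = { initial_ier };
    struct ier_section timer, console;

    section_enter(&u, &timer);
    section_leave(&u, &timer);
    section_enter(&u, &console);
    section_leave(&u, &console);
    return u.ier;
}
```

Starting from an IER of 0x05, run_interleaved returns 0, which matches the IER of 0 observed in case 1, while run_serialized returns 0x05.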

The following patch fixes this issue. Case 3 and case 4 are still seen often, but case 1 and case 2 no longer occur.
diff --git a/kernels/linux-2.6.31.1-X/drivers/serial/8250.c b/kernels/linux-2.6.31.1-X/drivers/serial/8250.c
index 288a0e4..55602c3 100644
--- a/kernels/linux-2.6.31.1-cavm1/drivers/serial/8250.c
+++ b/kernels/linux-2.6.31.1-cavm1/drivers/serial/8250.c
@@ -1752,6 +1758,8 @@ static void serial8250_backup_timeout(unsigned long data)
 	unsigned int iir, ier = 0, lsr;
 	unsigned long flags;

+
+	spin_lock_irqsave(&up->port.lock, flags);
 	/*
 	 * Must disable interrupts or else we risk racing with the interrupt
 	 * based handler.
@@ -1769,10 +1777,8 @@ static void serial8250_backup_timeout(unsigned long data)
 	 * the "Diva" UART used on the management processor on many HP
 	 * ia64 and parisc boxes.
 	 */
-	spin_lock_irqsave(&up->port.lock, flags);
 	lsr = serial_in(up, UART_LSR);
 	up->lsr_saved_flags |= lsr & LSR_SAVE_FLAGS;
-	spin_unlock_irqrestore(&up->port.lock, flags);
 	if ((iir & UART_IIR_NO_INT) && (up->ier & UART_IER_THRI) &&
 	    (!uart_circ_empty(&up->port.info->xmit) || up->port.x_char) &&
 	    (lsr & UART_LSR_THRE)) {
@@ -1780,12 +1786,14 @@ static void serial8250_backup_timeout(unsigned long data)
 		iir |= UART_IIR_THRI;
 	}

-	if (!(iir & UART_IIR_NO_INT))
-		serial8250_handle_port(up);
-
 	if (is_real_interrupt(up->port.irq))
 		serial_out(up, UART_IER, ier);

+	spin_unlock_irqrestore(&up->port.lock, flags);
+
+	if (!(iir & UART_IIR_NO_INT))
+		serial8250_handle_port(up);
+
 	/* Standard timer interval plus 0.2s to keep the port running */
 	mod_timer(&up->timer,
 		jiffies + poll_timeout(up->port.timeout) + HZ / 5);


SMP issues with 8250.c
http://old.nabble.com/SMP-issues-with-8250.c%E2%80%8F-to27090634.html
http://www.spinics.net/lists/linux-serial/msg02106.html
