LogoLogo
  • 简介
  • 引导
    • 从引导加载程序内核
    • 在内核安装代码的第一步
    • 视频模式初始化和转换到保护模式
    • 过渡到 64 位模式
    • 内核解压缩
  • 初始化
    • 内核解压之后的首要步骤
    • 早期的中断和异常控制
    • 在到达内核入口之前最后的准备
    • 内核入口 - start_kernel
    • 体系架构初始化
    • 进一步初始化指定体系架构
    • 最后对指定体系架构初始化
    • 调度器初始化
    • RCU 初始化
    • 初始化结束
  • 中断
    • 中断和中断处理第一部分
    • 深入 Linux 内核中的中断
    • 初步中断处理
    • 中断处理
    • 异常处理的实现
    • 处理不可屏蔽中断
    • 深入外部硬件中断
    • IRQs的非早期初始化
    • Softirq, Tasklets and Workqueues
    • 最后一部分
  • 系统调用
    • 系统调用概念简介
    • Linux 内核如何处理系统调用
    • vsyscall and vDSO
    • Linux 内核如何运行程序
    • open 系统调用的实现
    • Linux 资源限制
  • 定时器和时钟管理
    • 简介
    • 时钟源框架简介
    • The tick broadcast framework and dyntick
    • 定时器介绍
    • Clockevents 框架简介
    • x86 相关的时钟源
    • Linux 内核中与时钟相关的系统调用
  • 同步原语
    • 自旋锁简介
    • 队列自旋锁
    • 信号量
    • 互斥锁
    • 读者/写者信号量
    • 顺序锁
    • RCU
    • Lockdep
  • 内存管理
    • 内存块
    • 固定映射地址和 ioremap
    • kmemcheck
  • 控制组
    • 控制组简介
  • SMP
  • 概念
    • 每个 CPU 的变量
    • CPU 掩码
    • initcall 机制
    • Linux 内核的通知链
  • Linux 内核中的数据结构
    • 双向链表
    • 基数树
    • 位数组
  • 理论
    • 分页
    • ELF 文件格式
    • 內联汇编
    • CPUID
    • MSR
  • Initial ram disk
  • 杂项
    • Linux 内核开发
    • 内核编译方法
    • 链接器
    • 用户空间的程序启动过程
    • 书写并提交你第一个内核补丁
  • 内核数据结构
    • 中断描述符表
  • 有帮助的链接
  • 贡献者
由 GitBook 提供支持
在本页
  • Introduction
  • Sequential lock
  • Sequential lock API
  • Conclusion
  • Links
  1. 同步原语

顺序锁

上一页读者/写者信号量下一页RCU

最后更新于1年前

Introduction

This is the sixth part of the chapter which describes in the Linux kernel and in the previous parts we finished to consider different synchronization primitives. We will continue to learn synchronization primitives in this part and start to consider a similar synchronization primitive which can be used to avoid the writer starvation problem. The name of this synchronization primitive is - seqlock or sequential locks.

We know from the previous that is a special lock mechanism which allows concurrent access for read-only operations, but an exclusive lock is needed for writing or modifying data. As we may guess, it may lead to a problem which is called writer starvation. In other words, a writer process can't acquire a lock as long as at least one reader process which acquired a lock holds it. So, in the situation when contention is high, it will lead to situation when a writer process which wants to acquire a lock will wait for it for a long time.

The seqlock synchronization primitive can help solve this problem.

As in all previous parts of this , we will try to consider this synchronization primitive from the theoretical side and only than we will consider provided by the Linux kernel to manipulate the seqlocks.

So, let's start.

Sequential lock

So, what is a seqlock synchronization primitive and how does it work? Let's try to answer these questions in this paragraph. Actually sequential locks were introduced in the Linux kernel 2.6.x. Main point of this synchronization primitive is to provide fast and lock-free access to shared resources. Since the heart of sequential lock synchronization primitive is synchronization primitive, sequential locks work in situations where the protected resources are small and simple. Additionally write access must be rare and also should be fast.

Work of this synchronization primitive is based on the sequence of events counter. Actually a sequential lock allows free access to a resource for readers, but each reader must check existence of conflicts with a writer. This synchronization primitive introduces a special counter. The main algorithm of work of sequential locks is simple: Each writer which acquired a sequential lock increments this counter and additionally acquires a . When this writer finishes, it will release the acquired spinlock to give access to other writers and increment the counter of a sequential lock again.

Read only access works on the following principle, it gets the value of a sequential lock counter before it will enter into and compares it with the value of the same sequential lock counter at the exit of critical section. If their values are equal, this means that there weren't writers for this period. If their values are not equal, this means that a writer has incremented the counter during the . This conflict means that reading of protected data must be repeated.

That's all. As we may see principle of work of sequential locks is simple.

unsigned int seq_counter_value;

do {
    seq_counter_value = get_seq_counter_val(&the_lock);
    //
    // do as we want here
    //
} while (__retry__);

Sequential lock API

First of all we may see that the a sequential lock mechanism is represented by the following type:

typedef struct {
	struct seqcount seqcount;
	spinlock_t lock;
} seqlock_t;
typedef struct seqcount {
	unsigned sequence;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	struct lockdep_map dep_map;
#endif
} seqcount_t;

We saw in the previous parts that often the Linux kernel provides two approaches to execute initialization of the given synchronization primitive. The same situation with the seqlock_t structure. These approaches allows to initialize a seqlock_t in two following:

  • statically;

  • dynamically.

ways. Let's look at the first approach. We are able to initialize a seqlock_t statically with the DEFINE_SEQLOCK macro:

#define DEFINE_SEQLOCK(x) \
		seqlock_t x = __SEQLOCK_UNLOCKED(x)
#define __SEQLOCK_UNLOCKED(lockname)			\
	{						\
		.seqcount = SEQCNT_ZERO(lockname),	\
		.lock =	__SPIN_LOCK_UNLOCKED(lockname)	\
	}

As we may see the, __SEQLOCK_UNLOCKED macro executes initialization of fields of the given seqlock_t structure. The first field is seqcount initialized with the SEQCNT_ZERO macro which expands to the:

#define SEQCNT_ZERO(lockname) { .sequence = 0, SEQCOUNT_DEP_MAP_INIT(lockname)}
#ifdef CONFIG_DEBUG_LOCK_ALLOC
# define SEQCOUNT_DEP_MAP_INIT(lockname) \
    .dep_map = { .name = #lockname } \
    ...
    ...
    ...
#else
# define SEQCOUNT_DEP_MAP_INIT(lockname)
    ...
    ...
    ...
#endif

Let's look at the implementation of this macro:

#define seqlock_init(x)					\
	do {						\
		seqcount_init(&(x)->seqcount);		\
		spin_lock_init(&(x)->lock);		\
	} while (0)

As we may see, the seqlock_init expands into two macros. The first macro seqcount_init takes counter of the given sequential lock and expands to the call of the __seqcount_init function:

# define seqcount_init(s)				\
	do {						\
		static struct lock_class_key __key;	\
		__seqcount_init((s), #s, &__key);	\
	} while (0)

from the same header file. This function

static inline void __seqcount_init(seqcount_t *s, const char *name,
					  struct lock_class_key *key)
{
    lockdep_init_map(&s->dep_map, name, key, 0);
    s->sequence = 0;
}
static inline unsigned read_seqbegin(const seqlock_t *sl);
static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start);
static inline void write_seqlock(seqlock_t *sl);
static inline void write_sequnlock(seqlock_t *sl);
static inline void write_seqlock_irq(seqlock_t *sl);
static inline void write_sequnlock_irq(seqlock_t *sl);
static inline void read_seqlock_excl(seqlock_t *sl)
static inline void read_sequnlock_excl(seqlock_t *sl)

As we may see this function just returns value of the read_seqcount_begin function:

static inline unsigned read_seqbegin(const seqlock_t *sl)
{
	return read_seqcount_begin(&sl->seqcount);
}

In its turn the read_seqcount_begin function calls the raw_read_seqcount_begin function:

static inline unsigned read_seqcount_begin(const seqcount_t *s)
{
	return raw_read_seqcount_begin(s);
}

which just returns value of the sequential lock counter:

static inline unsigned raw_read_seqcount(const seqcount_t *s)
{
	unsigned ret = READ_ONCE(s->sequence);
	smp_rmb();
	return ret;
}

After we have the initial value of the given sequential lock counter and did some stuff, we know from the previous paragraph of this function, that we need to compare it with the current value of the counter the same sequential lock before we will exit from the critical section. We can achieve this by the call of the read_seqretry function. This function takes a sequential lock, start value of the counter and through a chain of functions:

static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
{
	return read_seqcount_retry(&sl->seqcount, start);
}

static inline int read_seqcount_retry(const seqcount_t *s, unsigned start)
{
	smp_rmb();
	return __read_seqcount_retry(s, start);
}

it calls the __read_seqcount_retry function:

static inline int __read_seqcount_retry(const seqcount_t *s, unsigned start)
{
	return unlikely(s->sequence != start);
}

which just compares value of the counter of the given sequential lock with the initial value of this counter. If the initial value of the counter which is obtained from read_seqbegin() function is odd, this means that a writer was in the middle of updating the data when our reader began to act. In this case the value of the data can be in inconsistent state, so we need to try to read it again.

u64 get_jiffies_64(void)
{
	unsigned long seq;
	u64 ret;

	do {
		seq = read_seqbegin(&jiffies_lock);
		ret = jiffies_64;
	} while (read_seqretry(&jiffies_lock, seq));
	return ret;
}

Here we just read the value of the counter of the jiffies_lock sequential lock and then we write value of the jiffies_64 system variable to the ret. As here we may see do/while loop, the body of the loop will be executed at least one time. So, as the body of loop was executed, we read and compare the current value of the counter of the jiffies_lock with the initial value. If these values are not equal, execution of the loop will be repeated, else get_jiffies_64 will return its value in ret.

We just saw the first type of readers which do not block writer and other readers. Let's consider second type. It does not update value of a sequential lock counter, but just locks spinlock:

static inline void read_seqlock_excl(seqlock_t *sl)
{
	spin_lock(&sl->lock);
}

So, no one reader or writer can't access protected data. When a reader finishes, the lock must be unlocked with the:

static inline void read_sequnlock_excl(seqlock_t *sl)
{
	spin_unlock(&sl->lock);
}

function.

Now we know how sequential lock work for readers. Let's consider how does writer act when it wants to acquire a sequential lock to modify data. To acquire a sequential lock, writer should use write_seqlock function. If we look at the implementation of this function:

static inline void write_seqlock(seqlock_t *sl)
{
	spin_lock(&sl->lock);
	write_seqcount_begin(&sl->seqcount);
}

We will see that it acquires spinlock to prevent access from other writers and calls the write_seqcount_begin function. This function just increments value of the sequential lock counter:

static inline void raw_write_seqcount_begin(seqcount_t *s)
{
	s->sequence++;
	smp_wmb();
}

When a writer process will finish to modify data, the write_sequnlock function must be called to release a lock and give access to other writers or readers. Let's consider the implementation of the write_sequnlock function. It looks pretty simple:

static inline void write_sequnlock(seqlock_t *sl)
{
	write_seqcount_end(&sl->seqcount);
	spin_unlock(&sl->lock);
}

First of all it just calls write_seqcount_end function to increase value of the counter of the sequential lock again:

static inline void raw_write_seqcount_end(seqcount_t *s)
{
	smp_wmb();
	s->sequence++;
}

and in the end we just call the spin_unlock macro to give access for other readers or writers.

static inline void write_seqlock_irq(seqlock_t *sl)
{
	spin_lock_irq(&sl->lock);
	write_seqcount_begin(&sl->seqcount);
}

static inline void write_sequnlock_irq(seqlock_t *sl)
{
	write_seqcount_end(&sl->seqcount);
	spin_unlock_irq(&sl->lock);
}

As we may see, these functions differ only in the initialization of spinlock. They call spin_lock_irq and spin_unlock_irq instead of spin_lock and spin_unlock.

That's all.

Conclusion

Links

Actually the Linux kernel does not provide get_seq_counter_val() function. Here it is just a stub. Like a __retry__ too. As I already wrote above, we will see actual the for this in the next paragraph of this part.

Ok, now we know what a seqlock synchronization primitive is and how it is represented in the Linux kernel. In this case, we may go ahead and start to look at the which the Linux kernel provides for manipulation of synchronization primitives of this type.

So, now we know a little about sequential lock synchronization primitive from theoretical side, let's look at its implementation in the Linux kernel. All sequential locks are located in the header file.

As we may see the seqlock_t provides two fields. These fields represent a sequential lock counter, description of which we saw above and also a which will protect data from other writers. Note that the seqcount counter represented as seqcount type. The seqcount is structure:

which holds counter of a sequential lock and related field.

As always in previous parts of this , before we will consider an of sequential lock mechanism in the Linux kernel, we need to know how to initialize an instance of seqlock_t.

which is defined in the header file. As we may see, the DEFINE_SEQLOCK macro takes one argument and expands to the definition and initialization of the seqlock_t structure. Initialization occurs with the help of the __SEQLOCK_UNLOCKED macro which is defined in the same source code file. Let's look at the implementation of this macro:

So we just initialize counter of the given sequential lock to zero and additionally we can see related initialization which depends on the state of the CONFIG_DEBUG_LOCK_ALLOC kernel configuration option:

As I already wrote in previous parts of this we will not consider and related stuff in this part. So for now we just skip the SEQCOUNT_DEP_MAP_INIT macro. The second field of the given seqlock_t is lock initialized with the __SPIN_LOCK_UNLOCKED macro which is defined in the header file. We will not consider implementation of this macro here as it just initializes with architecture-specific methods (More about spinlocks you may read in first parts of this ).

We have considered the first way to initialize a sequential lock. Let's consider second way to do the same, but do it dynamically. We can initialize a sequential lock with the seqlock_init macro which is defined in the same header file.

just initializes counter of the given seqcount_t with zero. The second call from the seqlock_init macro is the call of the spin_lock_init macro which we saw in the of this chapter.

So, now we know how to initialize a sequential lock, now let's look at how to use it. The Linux kernel provides following to manipulate sequential locks:

and others. Before we move on to considering the implementation of this , we must know that there actually are two types of readers. The first type of reader never blocks a writer process. In this case writer will not wait for readers. The second type of reader which can lock. In this case, the locking reader will block the writer as it will wait while reader will not release its lock.

First of all let's consider the first type of readers. The read_seqbegin function begins a seq-read .

This is a common pattern in the Linux kernel. For example, you may remember the jiffies concept from the of the chapter. The sequential lock is used to obtain value of jiffies at architecture:

That's all about sequential lock mechanism in the Linux kernel. Of course we did not consider full of this mechanism in this part. But all other functions are based on these which we described here. For example, Linux kernel also provides some safe macros/functions to use sequential lock mechanism in of : write_seqclock_irq and write_sequnlock_irq:

Or for example write_seqlock_irqsave and write_sequnlock_irqrestore functions which are the same but used spin_lock_irqsave and spin_unlock_irqsave macro to use in handlers.

This is the end of the sixth part of the chapter in the Linux kernel. In this part we met with new synchronization primitive which is called - sequential lock. From the theoretical side, this synchronization primitive very similar on a synchronization primitive, but allows to avoid writer-starving issue.

If you have questions or suggestions, feel free to ping me in twitter , drop me or just create .

Please note that English is not my first language and I am really sorry for any inconvenience. If you found any mistakes please send me PR to .

synchronization primitives
readers-writer lock
part
readers-writer lock
book
API
spinlock
spinlock
critical section
critical section
API
API
API
include/linux/seqlock.h
spinlock
lock validator
chapter
API
include/linux/seqlock.h
lock validator
chapter
debugging
lock validator
include/linux/spinlock_types.h
rawspinlock
chapter
include/linux/seqlock.h
first part
API
API
critical section
first part
timers and time management in the Linux kernel
x86_64
API
interrupt handlers
softirq
IRQ
synchronization primitives
readers-writer lock
0xAX
email
issue
linux-insides
synchronization primitives
readers-writer lock
spinlock
critical section
lock validator
debugging
API
x86_64
Timers and time management in the Linux kernel
interrupt handlers
softirq
IRQ
Previous part