An ASR33 Teletype - origin of the abbreviation tty.
We meet several kinds of objects (character devices, tty drivers,
line disciplines). Each registers itself at kernel initialization
time (or module insertion time), and can afterwards be found
when an open() is done.
A character device announces its existence by calling
register_chrdev(). The call
    register_chrdev(major, name, fops);
stores the given name (a string) and fops (a struct file_operations *)
in the entry of the array chrdevs[] indexed by the integer major,
the major device number of the device.
(Devices have a number, the device number, a combination of major and minor device number. Traditionally, the major device number gives the kind of device, and the minor device number is some kind of unit number. However, there are no rules - it is best to consider a device number a cookie, without known structure.)
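The registration step can be mimicked in user space. The following is a minimal sketch, not the kernel code: the table size MAX_CHRDEV, the cut-down struct file_operations, the helper lookup_chrdev() and the -1 error convention are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHRDEV 256          /* assumed table size */

/* Hypothetical stand-in for the kernel's struct file_operations. */
struct file_operations {
    int (*open)(void);
};

struct chrdev_entry {
    const char *name;
    struct file_operations *fops;
};

static struct chrdev_entry chrdevs[MAX_CHRDEV];

/* Store name and fops in the slot indexed by major. Returns -1 if the
 * major is out of range or the slot is taken (assumed convention). */
int register_chrdev(unsigned int major, const char *name,
                    struct file_operations *fops)
{
    assert(name != NULL && fops != NULL);
    if (major >= MAX_CHRDEV || chrdevs[major].fops != NULL)
        return -1;
    chrdevs[major].name = name;
    chrdevs[major].fops = fops;
    return 0;
}

/* Hypothetical helper: what was registered for this major, or NULL. */
struct file_operations *lookup_chrdev(unsigned int major)
{
    return major < MAX_CHRDEV ? chrdevs[major].fops : NULL;
}
```

The point of the sketch is only that registration is a table store keyed by the major number, and that opening later is a table lookup with the same key.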
This stored entry is used again when the device is opened:
The filesystem recognizes that the file that is being opened
is a special device file, and invokes init_special_inode().
This routine does
    void init_special_inode(struct inode *inode, umode_t mode, dev_t rdev)
    {
            inode->i_mode = mode;
            if (S_ISCHR(mode)) {
                    inode->i_fop = &def_chr_fops;
                    inode->i_rdev = to_kdev_t(rdev);
                    inode->i_cdev = cdget(rdev);
            } else ...
    }
Here to_kdev_t()
converts the user mode version of the
device number to the kernel version of the device number.
The cdget() returns the struct char_device for this
device number. It finds it using a hash table, and if we did not
have it already, a new one is allocated. In all cases, the
reference count of this struct is increased by one.
The struct looks like
    struct char_device {
            struct list_head hash;
            atomic_t count;
            dev_t dev;
            atomic_t openers;
            struct semaphore sem;
    };
Here hash is a link in the chain of devices with the same hash,
count is the number of references - each cdget() increases and
each cdput() decreases this by one, and if it becomes zero,
the struct is removed from the hash chain and freed.
The field dev stores the device number, the
only thing we know about this device.
The fields openers and sem are unused.
Access to the hash table is protected by the cdev_lock spinlock.
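The get/put discipline described here is easy to try out in user space. Below is a minimal single-threaded sketch: the bucket count HASH_SIZE, the plain singly linked chain, and the use of calloc()/free() are assumptions, and the locking that cdev_lock provides in the kernel is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define HASH_SIZE 64                     /* assumed bucket count */

struct char_device {
    struct char_device *next;            /* chain of entries with equal hash */
    int count;                           /* reference count */
    unsigned int dev;                    /* device number */
};

static struct char_device *table[HASH_SIZE];

/* Find or allocate the entry for dev and take a reference. */
struct char_device *cdget(unsigned int dev)
{
    struct char_device **head = &table[dev % HASH_SIZE];
    struct char_device *cd;

    for (cd = *head; cd != NULL; cd = cd->next)
        if (cd->dev == dev)
            break;
    if (cd == NULL) {
        cd = calloc(1, sizeof(*cd));
        if (cd == NULL)
            return NULL;
        cd->dev = dev;
        cd->next = *head;
        *head = cd;
    }
    cd->count++;
    return cd;
}

/* Drop a reference; unlink and free the entry when the count hits zero. */
void cdput(struct char_device *cd)
{
    struct char_device **p = &table[cd->dev % HASH_SIZE];

    assert(cd->count > 0);
    if (--cd->count > 0)
        return;
    while (*p != cd)
        p = &(*p)->next;
    *p = cd->next;
    free(cd);
}
```

Two cdget() calls for the same device number return the same struct with count 2, and the struct disappears from its chain only after the second cdput().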
Finally the last item from init_special_inode():
    struct file_operations def_chr_fops = {
            .open = chrdev_open,
    };
That is, we cannot do anything with the character device except
opening it, and when we do chrdev_open() is called.
What is the use of struct char_device? After removing the unused
fields openers and sem we see that we just have a struct
in a hash chain and a reference count. It has no function at all,
and all related code can be deleted (from 2.5.59).
The system call routine sys_open() calls filp_open(),
and that calls dentry_open(), which does
    f->f_op = fops_get(inode->i_fop);
In other words, the file f_op is a copy of the inode i_fop.
(This fops_get() returns its argument,
but increments a reference count in case these file operations
live in a module.)
Finally, dentry_open() calls the inode open routine if
there is one:
    if (f->f_op && f->f_op->open)
            f->f_op->open(inode,f);
Thus, it is here that chrdev_open() is called.
    int chrdev_open(struct inode *inode, struct file *filp)
    {
            int ret = -ENODEV;

            filp->f_op = get_chrfops(major(inode->i_rdev),
                                     minor(inode->i_rdev));
            if (filp->f_op) {
                    ret = 0;
                    if (filp->f_op->open)
                            ret = filp->f_op->open(inode,filp);
            }
            return ret;
    }
And the routine get_chrfops() retrieves the
struct file_operations * that was registered:
    struct file_operations *get_chrfops(unsigned int major,
                                        unsigned int minor)
    {
            return fops_get(chrdevs[major].fops);
    }
(The actual routine checks whether the device did register already,
and if not does a request_module("char-major-N") first,
where N is the major number.)
We see that the inode fops remains unchanged, so that its open
still points to chrdev_open(), but the file fops is changed
and now points to what the device registered.
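The two-step open, where a generic open routine replaces the file operations with those of the driver, can be sketched in user space as follows. The reduced structs, the helper open_file() (standing in for the relevant part of dentry_open()), the demo driver demo_tty_fops and the return value -1 in place of -ENODEV are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

struct inode;
struct file;

struct file_operations {
    int (*open)(struct inode *, struct file *);
};

struct inode { struct file_operations *i_fop; unsigned int major; };
struct file  { struct file_operations *f_op; };

static struct file_operations *chrdevs[256];   /* registered per major */

/* Generic open for character devices: swap in the driver's fops. */
static int chrdev_open(struct inode *inode, struct file *filp)
{
    filp->f_op = inode->major < 256 ? chrdevs[inode->major] : NULL;
    if (filp->f_op == NULL)
        return -1;                     /* stands in for -ENODEV */
    return filp->f_op->open ? filp->f_op->open(inode, filp) : 0;
}

struct file_operations def_chr_fops = { .open = chrdev_open };

/* Copy the inode fops to the file, then call its open routine. */
int open_file(struct inode *inode, struct file *f)
{
    assert(inode != NULL && f != NULL);
    f->f_op = inode->i_fop;
    if (f->f_op && f->f_op->open)
        return f->f_op->open(inode, f);
    return 0;
}

/* Hypothetical driver used for demonstration. */
static int demo_open_calls;

static int demo_tty_open(struct inode *inode, struct file *filp)
{
    (void)inode;
    (void)filp;
    demo_open_calls++;
    return 0;
}

struct file_operations demo_tty_fops = { .open = demo_tty_open };
```

After open_file() returns, the inode still carries def_chr_fops while the file carries the driver's operations, which is exactly the asymmetry noted above.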
Let us focus on /dev/tty1, the first virtual console.
Most code lives in drivers/char, in the files
tty_io.c and n_tty.c and vt.c.
A tty driver announces its existence by calling tty_register_driver().
This call does a register_chrdev() (with tty_fops)
and hangs the driver in the chain tty_drivers.
That chain is used by get_tty_driver(), a routine that
given a device number finds the tty driver that handles the device
with that number.
This routine is used in two places: in fs/char_dev.c:get_chrfops()
and in tty_io.c:init_dev(), called from tty_open.
The latter use was expected, but what is this strange first use?
    #define is_a_tty_dev(ma)    (ma == TTY_MAJOR || ma == TTYAUX_MAJOR)
    #define need_serial(ma,mi)  (get_tty_driver(mk_kdev(ma,mi)) == NULL)

    static struct file_operations *
    get_chrfops(unsigned int major, unsigned int minor)
    {
            ...
            ret = fops_get(chrdevs[major].fops);
            if (ret && is_a_tty_dev(major) && need_serial(major,minor)) {
                    fops_put(ret);
                    ret = NULL;
            }
            if (!ret) {
                    char name[20];
                    sprintf(name, "char-major-%d", major);
                    request_module(name);
                    ret = fops_get(chrdevs[major].fops);
            }
            return ret;
    }
The idea here is that majors 4 and 5 (TTY_MAJOR and TTYAUX_MAJOR)
may be served by several modules.
Indeed, /dev/tty1 has major,minor 4,1 and is a virtual
console, while /dev/ttyS1 has major,minor 4,65 and is
a serial line. Thus, in drivers/serial/core.c:uart_register_driver()
we see a call of tty_register_driver(), and uart_register_driver()
in turn is called, e.g., to register serial8250_reg, defined as
    struct uart_driver serial8250_reg = {
            .owner       = THIS_MODULE,
            .driver_name = "serial",
            .dev_name    = "ttyS%d",
            .major       = TTY_MAJOR,
            .minor       = 64,
            .nr          = UART_NR,
            .cons        = SERIAL8250_CONSOLE,
    };
while vt.c:vty_init() calls tty_register_driver()
to register console_driver with major = TTY_MAJOR
and minor_start = 1.
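How two drivers can share major 4 becomes clear from a lookup that checks the minor range as well as the major. The following user-space sketch assumes a reduced struct tty_driver with a field num for the number of minors served; the counts used in the usage example (63 consoles, 4 serial lines) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TTY_MAJOR 4

struct tty_driver {
    const char *name;
    int major;
    int minor_start;
    int num;                      /* number of minors served (assumed field) */
    struct tty_driver *next;      /* link in the tty_drivers chain */
};

static struct tty_driver *tty_drivers;

/* Hang the driver in the chain. */
void tty_register_driver(struct tty_driver *driver)
{
    assert(driver != NULL);
    driver->next = tty_drivers;
    tty_drivers = driver;
}

/* Return the driver whose minor range covers (major, minor), or NULL. */
struct tty_driver *get_tty_driver(int major, int minor)
{
    struct tty_driver *p;

    for (p = tty_drivers; p != NULL; p = p->next) {
        if (p->major != major)
            continue;
        if (minor < p->minor_start || minor >= p->minor_start + p->num)
            continue;
        return p;
    }
    return NULL;
}
```

With only a console driver registered for minors 1-63, get_tty_driver(4, 65) returns NULL. That is the situation in which need_serial() is true and a module load is attempted.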
As we saw above, opening a character device ends up with calling
the open routine from the struct file_operations registered by the
device. In the case of a tty, the open routine in tty_fops
is tty_open.
The routine tty_open is long and messy, with a lot of
special purpose code for controlling ttys, for pseudottys, etc.
In the ordinary case the essential part is
    tty_open(struct inode *inode, struct file *file)
    {
            struct tty_struct *tty;
            kdev_t device = inode->i_rdev;

            init_dev(device, &tty);
            file->private_data = tty;
            tty->driver.open(tty,file);
    }
Thus, first of all, we create a tty_struct. Next, a pointer to this
tty_struct is stored in the private_data field of the file struct,
so that we can find it later, for example in tty_read():
    tty_read(struct file *file, char *buf, size_t count, ...)
    {
            struct tty_struct *tty = file->private_data;

            (tty->ldisc.read)(tty,file,buf,count);
    }
Finally we call the open routine of the driver. The field
tty->driver was set in init_dev():
    init_dev(kdev_t device, struct tty_struct **ret_tty)
    {
            struct tty_driver *driver = get_tty_driver(device);
            struct tty_struct *tty = alloc_tty_struct();

            initialize_tty_struct(tty);
            tty->device = device;
            tty->driver = *driver;
            (tty->ldisc.open)(tty);
            *ret_tty = tty;
    }
Note that the entire struct tty_driver is copied in the assignment,
so that individual fields can be changed without damaging the struct
that was registered. However, this is never done, so having a copy
is a waste of memory.
The line discipline gives the protocol on the serial line.
Each line discipline has a number, and the normal one is called
N_TTY (0).
Line disciplines are registered by tty_register_ldisc(),
by storing a struct tty_ldisc in the array ldiscs[]
(where the index is the line discipline number).
The normal discipline is registered by console_init(),
as first among the registered disciplines:
    void __init console_init(void)
    {
            memset(ldiscs, 0, sizeof(ldiscs));
            tty_register_ldisc(N_TTY, &tty_ldisc_N_TTY);
            ...
    }
The call
    initialize_tty_struct(tty);
that we saw in init_dev() does, among other things,
    tty->ldisc = ldiscs[N_TTY];
Thus, when tty->ldisc.open is called, it is the open field
of the struct tty_ldisc_N_TTY. This struct lives in
n_tty.c and its open field is n_tty_open.
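The line discipline table works like the chrdevs[] table, indexed by discipline number instead of major number. A minimal user-space sketch follows. The table size NR_LDISCS, the reduced structs and the opened flag are assumptions, and the body of n_tty_open() here is a stand-in, not what n_tty.c does.

```c
#include <assert.h>
#include <string.h>

#define NR_LDISCS 16              /* assumed table size */
#define N_TTY     0

struct tty_struct;

struct tty_ldisc {
    const char *name;
    int (*open)(struct tty_struct *);
};

struct tty_struct {
    struct tty_ldisc ldisc;       /* a copy, as in tty->ldisc = ldiscs[N_TTY] */
    int opened;                   /* hypothetical flag for the demonstration */
};

static struct tty_ldisc ldiscs[NR_LDISCS];

/* Store a line discipline in the slot given by its number. */
int tty_register_ldisc(int disc, const struct tty_ldisc *new_ldisc)
{
    if (disc < 0 || disc >= NR_LDISCS)
        return -1;
    ldiscs[disc] = *new_ldisc;
    return 0;
}

/* Stand-in for the real n_tty_open(). */
static int n_tty_open(struct tty_struct *tty)
{
    tty->opened = 1;
    return 0;
}

static const struct tty_ldisc tty_ldisc_N_TTY = { "n_tty", n_tty_open };

/* Every new tty starts out with the normal discipline. */
void initialize_tty_struct(struct tty_struct *tty)
{
    assert(tty != NULL);
    memset(tty, 0, sizeof(*tty));
    tty->ldisc = ldiscs[N_TTY];
}
```

Once N_TTY has been registered, calling (tty->ldisc.open)(tty) on a freshly initialized tty ends up in n_tty_open().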
After this preparation, finally tty->driver.open(tty,file)
is called. Since we had /dev/tty1 in mind, that is, one of
the virtual consoles, let us see what routine this is.
In vt.c:vty_init() we see
    console_driver.open = con_open;
    ...
    tty_register_driver(&console_driver);
So, our open routine is con_open(), an amusing open routine.
It creates a virtual console if there wasn't one. So, if you have
8 virtual consoles but open /dev/tty23 then you have 9.
If you have lots of unused consoles and want to free the memory
they take, use the command deallocvt.
Exercise Which keystroke changes to console 23?
The system call sys_read() is found in fs/read_write.c.
It calls vfs_read(), and this calls file->f_op->read().
In our case, this is the read routine of tty_fops, which
unsurprisingly is tty_read. And above we saw that this
calls tty->ldisc.read, which is the read field of
tty_ldisc_N_TTY, called read_chan.
The code is in n_tty.c. It downs the semaphore
tty->atomic_read, hangs itself in the wait queue
tty->read_wait of waiters for input, goes to sleep
if no input is available, copies input to the user buffer,
ups the semaphore tty->atomic_read and returns.
(Reality is much more complicated. Try to read the code.)
So, hopefully, somebody will fill the input buffer. Who?
Keyboard interrupts arrive at
input/keyboard/atkbd.c:atkbd_interrupt().
It handles the keyboard protocol and converts scancode to
keycode. Then input_report_key() is called, a define
for input_event(), and this routine offers the event
to all registered handlers.
Now keyboard.c:kbd_init() registers kbd_handler,
and the result is that keyboard keystrokes will be handled by
keyboard.c:kbd_event(), which calls kbd_keycode().
Here keyboard raw, mediumraw, xlate and unicode modes are handled,
as is the magic sysrequest key. Scancodes have already been converted
to keycodes; here we convert back (yecch) for raw mode, leave things
for mediumraw mode, or further convert keycodes to characters using
the keymap (set by the utility loadkeys). Finally we call
    put_queue(vc, byte);
with the resulting bytes. Here vc is the foreground virtual console.
Now
    void put_queue(struct vc_data *vc, int ch)
    {
            struct tty_struct *tty = vc->vc_tty;

            tty_insert_flip_char(tty, ch, 0);
            con_schedule_flip(tty);
    }
that is, put_queue() retrieves vc->vc_tty
that was set by con_open(), and puts its stuff in the
flip buffer. Then the work of transporting this to the read_buffer
is scheduled. (In tty raw mode that is a plain copy, but in canonical
mode we must react to special characters: the erase character erases,
the interrupt character sends an interrupt, etc.)
And when the transporting has been done, the bytes are ready to
be read by a read() call.
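The difference between raw and canonical transport can be illustrated with a toy version of the copy step. This is a sketch under strong simplifications: only the erase character is handled, its value 0x7f is an assumption (the real one is configurable), and the function name flip_to_read_buf() is invented.

```c
#include <assert.h>
#include <stddef.h>

#define ERASE_CHAR 0x7f   /* assumed erase character (DEL) */

/*
 * Move n bytes from a flip buffer into read_buf (capacity cap).
 * Raw mode copies bytes unchanged; canonical mode lets the erase
 * character delete the previously stored byte of the current line.
 * Returns the number of bytes now in read_buf.
 */
size_t flip_to_read_buf(const unsigned char *flip, size_t n,
                        unsigned char *read_buf, size_t cap, int canonical)
{
    size_t len = 0;

    assert(flip != NULL && read_buf != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char c = flip[i];

        if (canonical && c == ERASE_CHAR) {
            /* Do not erase past the start of the line. */
            if (len > 0 && read_buf[len - 1] != '\n')
                len--;
            continue;
        }
        if (len < cap)
            read_buf[len++] = c;
    }
    return len;
}
```

Typing l, x, erase, s, newline yields "ls\n" in canonical mode, while raw mode delivers all bytes including the erase character itself.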
Writing is entirely analogous to reading. The system call sys_write()
calls vfs_write(), and this calls file->f_op->write().
In our case, this is the write routine of tty_fops, which
is tty_write.
It does do_tty_write(tty->ldisc.write, ...) which downs
the semaphore tty->atomic_write, possibly splits up the write
into smaller chunks, calls its first argument and ups the semaphore again.
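The chunking pattern, a routine that receives the actual write routine as its first argument, can be sketched as follows. The chunk size, the callback signature write_fn, the name do_chunked_write() and the demonstration sink are assumptions. The semaphore is omitted since the sketch is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SIZE 4   /* deliberately tiny; an assumption for the sketch */

typedef long (*write_fn)(const char *buf, size_t count, void *ctx);

/*
 * Feed buf to fn in pieces of at most CHUNK_SIZE bytes.
 * Stops early on an error or a short write and returns the number of
 * bytes written, or the error if nothing was written.
 */
long do_chunked_write(write_fn fn, const char *buf, size_t count, void *ctx)
{
    long written = 0;

    assert(fn != NULL);
    while (count > 0) {
        size_t size = count < CHUNK_SIZE ? count : CHUNK_SIZE;
        long ret = fn(buf, size, ctx);

        if (ret <= 0)
            return written ? written : ret;
        written += ret;
        buf += ret;
        count -= (size_t)ret;
        if ((size_t)ret < size)
            break;              /* short write: let the caller retry */
    }
    return written;
}

/* Hypothetical sink used for demonstration: counts calls and bytes. */
struct sink { int calls; size_t bytes; };

long sink_write(const char *buf, size_t count, void *ctx)
{
    struct sink *s = ctx;

    (void)buf;
    s->calls++;
    s->bytes += count;
    return (long)count;
}
```

A 10-byte write with this chunk size results in three calls of the callback, with 4, 4 and 2 bytes.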
The write routine here is the write field of tty_ldisc_N_TTY,
called write_chan. The code is in n_tty.c.
It hangs itself in the wait queue tty->write_wait
of waiters for room for output, tries to write by calling
tty->driver.write, and if that fails to write everything
goes to sleep.
Now our driver was console_driver with write routine
con_write that calls do_con_write.
Here very obscure things are done to handle escape sequences
(cursor movement, screen colours, scrolling, etc. etc.),
but in the normal case we see
    scr_writew((attr << 8) + byte, screenpos);
that actually writes the character and the (foreground / background /
intensity) attributes. All very messy code - not a joy to behold.
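The 16-bit value written to the screen packs the attribute into the high byte and the character into the low byte. The sketch below shows this packing. The attribute layout in make_attr() (foreground in bits 0-2, intensity in bit 3, background in bits 4-6) is the usual VGA text-mode convention and is an assumption here, as are the helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build an attribute byte: foreground in bits 0-2, intensity in bit 3,
 * background in bits 4-6 (assumed VGA text-mode layout). */
uint8_t make_attr(unsigned fg, unsigned bg, int bright)
{
    return (uint8_t)(((bg & 7u) << 4) | (bright ? 8u : 0u) | (fg & 7u));
}

/* Combine attribute and character as in (attr << 8) + byte. */
uint16_t make_cell(uint8_t attr, uint8_t byte)
{
    return (uint16_t)((attr << 8) + byte);
}

/* Store one cell at position pos of a screen buffer. */
void scr_put(uint16_t *screen, unsigned pos, uint8_t attr, uint8_t byte)
{
    assert(screen != NULL);
    screen[pos] = make_cell(attr, byte);
}
```

For example, a bright white 'A' on blue has attribute 0x1f and cell value 0x1f41 under this layout.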
Raw devices are character devices that can be bound to block devices.
I/O from/to raw devices bypasses the block caches.
Whether that is desirable depends on the application.
Usually it is undesirable - there are all kinds of issues
with raw devices. A main problem is that of coherency -
the block device should not also be accessed directly.
An annoyance is that I/O buffers must be aligned.
Very few standard programs do this.
The code for the raw device does set_blocksize(),
so that bad things happen if the device was open already
and using a different blocksize. Really, if raw is used
it must be the only access path to the block device.
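The alignment requirement mentioned above is easy to satisfy in a program written for the purpose. A minimal sketch using C11 aligned_alloc() follows. The alignment of 512 bytes is an assumption (the real requirement depends on the device and its block size), and the helper names are invented.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SECTOR_SIZE 512   /* assumed alignment; depends on the device */

/* Round a size up to the next multiple of SECTOR_SIZE. */
size_t round_up_to_sector(size_t size)
{
    return (size + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}

/* Allocate a buffer suitable for aligned I/O; release it with free(). */
void *alloc_io_buffer(size_t size)
{
    if (size == 0)
        return NULL;
    /* aligned_alloc() wants the size to be a multiple of the alignment. */
    return aligned_alloc(SECTOR_SIZE, round_up_to_sector(size));
}

/* Check whether p is aligned to the given power-of-two alignment. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

A buffer obtained from plain malloc() carries no such guarantee, which is why most standard programs fail on raw devices.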
The block device belonging to a raw device is noted down in
the private_data field of the file struct.
There are two ioctls: RAW_SETBIND and RAW_GETBIND.
The former connects a given raw device to a block device specified
by major, minor. The latter reports on a connection.
The file descriptor needed for the ioctl is that of the control raw
device, with minor number zero.
Unbinding is done by binding to major,minor = 0,0.
Binding is done by setting the i_mapping field of the raw device
inode to the i_mapping field of the block device. After rebinding
this will crash certain kernels because the inode for the block device
may have gone away.
For security purposes Linux has the devices /dev/random
and /dev/urandom. The former produces cryptographically
strong bits, but may block when no entropy is available.
The latter uses bits from the former when available, and
a strong random generator otherwise, and does not block.
Exercise Try
    dd if=/dev/urandom of=/dev/null bs=1024 count=1000
and immediately afterwards
    dd if=/dev/random of=/dev/null bs=1024 count=1
The former produces (more than) a megabyte of pseudorandom bits
in less than a second. Probably this will have exhausted
the entropy pool, and the latter will block until some randomness
arrives. Move the mouse a little.
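From a program, the non-blocking device is read like any other file. The sketch below uses stdio for brevity; the function name get_random_bytes_user() is invented, and it assumes that /dev/urandom exists and is readable on the system at hand.

```c
#include <assert.h>
#include <stdio.h>

/* Fill buf with n bytes from /dev/urandom.
 * Returns 0 on success, -1 on error. */
int get_random_bytes_user(unsigned char *buf, size_t n)
{
    FILE *fp;
    size_t got;

    assert(buf != NULL);
    fp = fopen("/dev/urandom", "rb");
    if (fp == NULL)
        return -1;
    got = fread(buf, 1, n, fp);
    fclose(fp);
    return got == n ? 0 : -1;
}
```

Note that stdio may read more bytes from the device than were asked for, because of its buffering. A program that cares about the entropy it consumes would use open() and read() instead.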
Randomness is needed in-kernel, e.g. for TCP sequence numbers - these
must be hard to predict by an attacker to prevent spoofing -, and in
user space for passwords or secret keys used to protect something -
say the key for the .Xauthority
file to protect access to
the X server. The random character device is a standard part of
the kernel, not something one selects with a config option.
The random device is a subdevice of the mem (for memory) device.
The character device major 1 has subdevices mem, kmem,
null, port, zero, full, random, urandom, kmsg
(for minors 1,2,3,4,5,7,8,9,11 - long ago minor 6 was /dev/core,
while minor 10 was reserved for /dev/aio but when aio was
implemented it was done differently).
Thus, the registration is found in drivers/char/mem.c.
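The minor numbers listed above amount to a small lookup table. As a sketch (the table and function names are invented, and the real mem.c selects file operations per minor rather than names):

```c
#include <assert.h>
#include <stddef.h>

/* Minor-to-name table for major 1, as listed in the text. */
static const char *const mem_minor_names[] = {
    [1] = "mem",  [2] = "kmem", [3] = "null",   [4] = "port",
    [5] = "zero", [7] = "full", [8] = "random", [9] = "urandom",
    [11] = "kmsg",
};

static_assert(sizeof(mem_minor_names) / sizeof(mem_minor_names[0]) == 12,
              "table must cover minors 0..11");

/* Return the subdevice name for a minor, or NULL if it is unassigned. */
const char *mem_minor_name(unsigned int minor)
{
    size_t n = sizeof(mem_minor_names) / sizeof(mem_minor_names[0]);

    return minor < n ? mem_minor_names[minor] : NULL;
}
```

The gaps at minors 6 and 10 show up as NULL entries.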
Randomness is stored in the entropy_store, which has
an associated variable entropy_count counting available
random bits. The routine random_read() sees whether we
have some bits, and if so returns them, and otherwise sleeps.
The routine urandom_read() just extracts some bits.
So the question is how to obtain randomness. Something nobody
can predict even when all running software is known.
The random device uses four sources, namely the routines
add_X_randomness, for X = keyboard, mouse, disk, interrupt.
The keyboard, and the mouse, and each IRQ,
and each disk have an associated structure
    struct timer_rand_state {
            __u32 last_time;
            __s32 last_delta,last_delta2;
            int dont_count_entropy:1;
    };
that remembers when we last did something, and the first and second
order differences in the sequence of points in time. The routines
add_keyboard_randomness() etc. call
add_timer_randomness(), and the current time and the
value contributed by the routine (keyboard scancode, mouse data, etc.)
are mixed into the pool. In order to estimate the amount of entropy
added, only the time is used, not the scancode (etc.) data.
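To see why differences of event times are a reasonable basis for an entropy estimate, consider the following sketch. It is loosely modelled on the idea described here and is not the kernel routine: taking the minimum over three orders of differences, halving it, and capping the result at 12 bits are assumptions of the sketch, as is the function name estimate_entropy().

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

struct timer_rand_state {
    uint32_t last_time;
    int32_t last_delta, last_delta2;
};

/* Magnitude of a signed difference. */
static uint32_t mag(int32_t v)
{
    return v < 0 ? 0u - (uint32_t)v : (uint32_t)v;
}

/*
 * Estimate the entropy (in bits) contributed by an event at time now.
 * Perfectly regular events give differences of zero and hence no credit.
 */
int estimate_entropy(struct timer_rand_state *s, uint32_t now)
{
    int32_t delta, delta2, delta3;
    uint32_t m;
    int bits = 0;

    assert(s != NULL);
    delta = (int32_t)(now - s->last_time);
    s->last_time = now;
    delta2 = delta - s->last_delta;       /* second order difference */
    s->last_delta = delta;
    delta3 = delta2 - s->last_delta2;     /* third order difference */
    s->last_delta2 = delta2;

    m = mag(delta);
    if (mag(delta2) < m)
        m = mag(delta2);
    if (mag(delta3) < m)
        m = mag(delta3);

    /* Credit roughly log2(m/2) bits, capped (assumed cap of 12). */
    for (m >>= 1; m != 0; m >>= 1)
        bits++;
    return bits > 12 ? 12 : bits;
}
```

Events arriving at times 100, 200, 300 earn credit only for the first one. From then on the second order difference is zero, so a strictly periodic source is judged to contribute nothing.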