DENX Software Engineering GmbH
Kirchenstraße 5
82194 Gröbenzell
This document is compiled from various sources. The Embedded Linux Internals part has largely been derived from Detlev Zundel's training material.
The Linux Device Driver part is strongly influenced by the book "Linux Device Drivers", Third Edition, and is therefore licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.0 License.
The newest version of this document can be found at
http://www.denx.de/twiki/bin/view/Training2/GenericOverview.
This section gives an overview of who we are and what services we provide.
DENX == two companies:
- DENX Software Engineering GmbH
- provide software engineering services in the area of Embedded and Real-Time Systems
- high level of expertise
- strong focus on Open Source Software, especially Linux, but also FreeBSD, NetBSD, etc.
- We port firmware and operating systems to your hardware and write device drivers and other low-level or hardware-related software components. We develop, tailor and install the base software for your embedded systems and provide solutions to problems such as performance optimization, security concepts or tools for automatic software updates. This allows you to put all your resources into the development of your applications. We also provide on-the-job training for engineers who need to ramp up quickly on how to develop software for Embedded Linux systems.
- DENX Computer Systems GmbH
- single source for ready-to-run hardware and software solutions that guarantee a trouble-free start of your projects
- We offer PowerPC evaluation boards, development systems and standard modules with Open Source firmware and Linux pre-installed, of course with full source code for both the U-Boot firmware and the Linux Kernel ("BSP" Board Support Packages). All boards ship with a free CDROM with our Embedded Linux Development Kit. Our BDM/JTAG debuggers interface perfectly to Linux tools (like GDB / DDD debuggers) and provide full MMU support for Linux kernel and device driver debugging.
- DENX Software Engineering:
- Dipl.-Ing. Wolfgang Denk (Managing Director)
- Dipl.-Math. techn. Detlev Zundel (Managing Director)
- Dipl.-Kauffr. Friederike Denk (CFO)
- Dr. Wolfgang Grandegger (Senior Software Engineer)
- Gary Jennejohn (Senior Software Engineer)
- Dipl.-Inf. Markus Klotzbücher (Software Engineer)
- Dipl.-Ing. Stefan Roese (Senior Software Engineer)
- Dipl.-Ing. Heiko Schocher (Software Engineer)
- Dipl.-Ing. Gunnar Larisch (Senior Software Engineer)
- Partner Company in Moscow, Russia: up to 12 engineers
- Partner Company in Krakow, Poland: 3 Engineers
- more developers (freelancers) as required for project
- DENX Computer Systems:
- Dipl.-Kauffr. Friederike Denk (CEO)
- Dipl.-Ing. Wolfgang Denk (CTO)
Customer Projects:
- Alcatel, Strasbourg, France: Webphone, ADSL Router
- Daimler Chrysler, Sindelfingen, Germany: Display Unit for Test Cars
- Daimler Chrysler, Stuttgart, Germany: FlexRay tools
- Dependable Computer Systems, Vienna, Austria: Analyzer for serial data bus (FlexRay)
- Fast TV Server AG, Munich, Germany: Digital Video Recorder (see http://www.tv-server.de/)
- Liebherr Werke, Ehingen, Germany: Main Controller for Mobile Cranes
- Lucent Technologies, Holmdel, New Jersey, USA: PPCBoot support for Optical Cross-Connect Board
- MicroSys Electronics GmbH, Sauerlach, Germany: U-Boot + Linux support for several boards (IP860, CU824, CPU86, PM825, PM826)
- Multidata GmbH, Darmstadt, Germany: ISDN Router with integrated 8 x hub
- Siemens AG, Munich, Germany: Peripheral Controller Unit, Card Controller Module, and Shelf Controller Module for Optical Network Elements
- Siemens AG, Vienna, Austria: Bluetooth Lan Access Point
- Speech Design, Germering, Germany: Tele-Server, Integrated Voicemail System
- TQ Components GmbH, Wessling, Germany: U-Boot + Linux support for all of their PowerPC boards
Public Projects:
What gives Unix users that smug expression?
- (Unix vs. UNIX vs. Linux)
- Unix is a Third System
- Grandfather CTSS (Compatible Time Sharing System)
- Father was MULTICS
- When AT&T backed out of the MULTICS consortium in 1969, the
former MULTICS developers were left with many good ideas
- they missed not only the nice (but expensive!) interactive computing
environment but also the community that had formed around it
- Ken Thompson had an idea of how to build a file system (and he
needed a machine for his SPACETRAVEL game, $75!)
- the tools to host the game became the core of the later OS! ->
strikingly similar to today's Unices!
- One of the first OSes written in a high-level language: C (1973), invented by Dennis
Ritchie, which evolved together with Unix
- In 1971 Unix was used at the Bell Labs patent department for "word
processing" (nroff(1))
- It was useful, so Ken and Dennis got a new toy:
a PDP-11
- In 1974 Unix was introduced to the public
- AT&T was not allowed to enter the software business, so they licensed
the source code.
- Soon it was used in university OS classes
- As source was available, Unix was improved and improvements were merged back again
- Unix evolved fast and diversified. ->
AT&T Unix, BSD (Berkeley Unix), Ultrix (DEC), AIX (IBM),
Solaris (Sun), HP-UX (HP), Xenix (Microsoft),
etc.
Unix family tree
- In 1983 AT&T was allowed to sell software: Unix was commercialized
- Community engaged in Unix wars (Sys V vs. BSD Unix)
- Microsoft made its famous deal with IBM
- Richard Stallman's story: the printer incident
- In September 1983 Stallman started the GNU project to build a complete free operating system.
- As it was a voluntary project, he chose Unix as a model so coordination of contributors would be easier.
- Stallman wrote Emacs, GCC and GDB as a foundation.
- Many volunteers contributed GPLd rewrites of Unix utilities (make, vi, sed, awk, etc.)
- Most of the GNU project was done in 1992 - last part missing: a kernel!
- The FSF started work on The HURD, a micro-kernel architecture based on the Mach kernel.
- In the 1970s software was not a product. Usually a complete OS was distributed with the hardware
- Advent of microcomputers opened a "market" for software
- How to treat software: Is it a Formula or Prose?
- Richard Stallman realized the implications of copyright
- Using the machinery for his purpose, he inverted the copyright
-> GPL also called Copyleft
- 21 year old Linus Torvalds starts "educational" project in 1991
- Tanenbaum's Minix served as a Model, but no code from it was used.
- Torvalds chose the GPL (GNU General Public License)
- many people helped, first self hosting release (0.11) in December 1991
- Together with the tools from the GNU project there suddenly was a
complete system consisting only of free software
- Everyone called it Linux despite Linux being only the kernel
- GNU/Linux is the more appropriate name, and it distinguishes the system from GNU/Hurd, which is still experimental
- Alan Cox ported Linux to the 68k platform
- Today Linux supports x86, 68k, PowerPC (Motorola, IBM, AMCC), ARM (StrongARM, X-Scale), MIPS, System/390, etc.
- GPL and LGPL
- GPL guarantees the following freedoms (as in free speech, not free beer)
- You can run the program as you wish for any purpose
- You can study the source code and change it to do what you want
- You can make copies and distribute them
- You can distribute modified versions
- Derivative work inherits the license (the misnamed "virus nature" of the GPL)
- Linking is considered derivative work in the GPL but not in the GNU Lesser General Public License (LGPL)
-> GNU libc is the most prominent LGPLd program
- BSD style licences:
- MIT: Copyright and permission notice must be included in all copies or substantial parts of the Software
- BSD: non-attribution provision (contributors' names may not be used to endorse or
promote products without permission), advertising clause (removed 1999)
- Apache License v2.0: strong branding, contributions become part of licensed work, license terminates if licensee initiates patent litigation
- -> BSD style licenses have no such derivative work clause and thus cannot prevent "misuse" of software (OpenBSD, FreeBSD, NetBSD)
- -> commercial application is usually intended (BSD Unix, X Window System)
- MPL (Mozilla Public License)
- somewhere in between BSD and GPL
- weak Copyleft: MPL source code modified stays MPL, but can be combined with any other code to form a proprietary program
- NPL (Netscape Public License): variant that gives the original developer the right to distribute
modifications by other contributors under whatever terms it desires.
- Open Source vs. Free Software
- Open source can be any license acknowledged by the Open Source Initiative (incl. all mentioned above)
- Because of so many licenses all dubbed "open source", check which freedoms they really entail
"The nice thing about standards is that there are so many of them to choose from." A. S. Tanenbaum
C
- ANSI C (1989)
- currently: C99 (1999)
portability of other languages:
- perl, python, c++, shell scripts
Unix
- the diversity of early Unices provoked standardization as early as 1983 (UDS83)
- POSIX (name suggested by R. Stallman), first release 1990
relevant today:
- SUS (Single Unix Specification)
- SUS conformance test required to use "UNIX" name
- specifies the POSIX base (1003.1)
- in contrast to POSIX freely available
- POSIX (by IEEE) defines
- 1003.1: C API
- 1003.2: Shells and helper programs (ksh, vi, awk, sed)
- 1003.1b: realtime extensions
- 1003.1c: threads
- -> is the basis of all later standards
- FHS (Filesystem Hierarchy Standard)
- above standards specify "what"
- FHS specifies "where"
- Linux Standard Base
- Meta-Standard
- incorporates the above specifications and extends
- IETF RFCs
- working implementation required!
- best way to achieve portability is use of Free Software
- don't underestimate the social component
- learn to live with changing environments
- get used to looking under the hood ("Read the source, Luke"), but use appropriate tools to help you:
- Source code navigation:
- grep -R
- ctags
- cscope
- snavigator
- linux cross reference
- take the time to learn a decent editor
- others: git, diff -u & patch
- work with communities (and respect their rules)
- Read the FAQ before posting
- do research before asking questions
- Things to avoid on mailing lists
- HTML mails (especially on u-boot ml)
- attachments
- legal disclaimers (yeah, it can be hard)
- whitespace
- bikeshed painting
- Unix is largely self-documenting:
- man pages (man man)
- info pages (info)
- /usr/share/doc/: many hidden gems here
- Mailing lists
- u-boot
- linuxppc-dev and linuxppc-embedded
- linux-arm-kernel and linux-arm
- linux-kernel (high volume)
- IRC channels (#mklinux and #ppc64 on freenode)
"The Unix philosophy is a set of cultural norms and philosophical
approaches to developing software based on the experience of leading
developers of the Unix operating system.", from Wikipedia.
-> It is more than simply best practices
Some important aspects:
- everything is a file
- "Make each program do one thing well." (Doug McIlroy)
- small, sharp and interconnected tools
- mechanism, not policy
- avoid premature optimization
- build prototypes
- rule of silence
- "Data dominates", Rob Pike
- "When in doubt, use brute force", Ken Thompson
- small is beautiful: "it's LOC spent, not produced", Edsger W. Dijkstra
- KISS
(mostly from "The Art of Unix Programming", Eric S. Raymond)
This section gives a very general introduction to real-time Linux with special focus on the Xenomai real-time framework
- Linux is a multiuser, time-sharing system
- Kernels up until 2.4 are not preemptible
(2.5 started to integrate low latency and kernel preemption patches)
- The current 2.6 kernels include major parts of the rt-preempt patchset but still offer no hard real-time.
- Alternatives are
- rt-preempt patch
- Dual kernel approach
- RTAI (Paolo Mantegazza) is GPLd
- Xenomai
- RT-Linux (Victor Yodaiken, FSM Labs) is patented. It is not clear
if the patent would stand up in court.
- Two-level structure is faster but more complicated
- dual kernel approach
-> Linux kernel runs as idle task
- Forked as Xenomai from RTAI/Fusion
-> Xenomai is more focused on clean design than on the lowest technically feasible latencies.
- Supports RTDM and userspace real-time
- ADEOS = Adaptive Domain Environment for Operating Systems by Karim Yaghmour
- simple layer between OS and hardware
- Provides Domains in which different operating systems can run side by side
- Implemented as kernel patch for linux kernel
- Xenomai is an abstract RTOS core (the Xenomai nucleus) that provides generic real-time services
- Skins (RTOS personalities) are built on top of the nucleus and provide a specific interface to applications. The following skins are supported:
- POSIX
- pSOS+ (R)
- VxWorks (R)
- VRTX (R)
- native
- uITRON
- RTAI
- Skins simplify porting from different proprietary RTOS
- real time applications can be implemented in user- or kernelspace
-> userspace should usually be preferred
- memory protection
- debugging easier
- what happens if real-time task uses linux syscall?
- IPC for inter-domain communication: RT pipes
- Differences
- VRTXsa and other proprietary OSes come from one source, whereas Linux
bears more resemblance to a toolbox with tools from different sources.
- Get support from knowledgeable people you choose yourself - not only from the vendor
(i.e. take your car to a garage, not to the vendor)
- Advantages
- Linux is supported on large number of systems. Support can not be abandoned
because all sources are available.
- Build environment is the same as target environment
- Huge amount of programs available (and protocols supported)
- Very good compiler suite for C/C++ (GCC) (ever traced a compiler bug?)
- Disadvantages
- Different API (although the Xenomai project supports VRTX API skin)
- Dual kernel approach
Installation:
- mount the CD-ROM or loop-mount the image (losetup)
-> Trap: the file system must be mounted with the exec option
- two steps:
- as regular user: install packages
./install -d /opt/eldk-4.1 ppc_4xxFP
- as root: run the ELDK_MAKEDEV and ELDK_FIXOWNER scripts
ELDK:
- ELDK is a functional and robust workhorse
- ELDK consists of two parts:
- host components: ELDT
- target components
- everything is in one directory
- take a look at the installation directory
- CROSS_COMPILE and PATH environment variables
- compile hello_world, file(1)
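The last two points can be tried out with a short sketch like the following. It assumes that ELDK 4.1 was installed to /opt/eldk-4.1 with the ppc_4xxFP packages (as in the installation step); the hello.c file and the temporary directory are made up for illustration, and the compile step is skipped if the cross compiler is not found.

```shell
# Assumption: ELDK 4.1 lives in /opt/eldk-4.1 with the ppc_4xxFP packages installed.
ELDK_DIR=/opt/eldk-4.1
export CROSS_COMPILE=ppc_4xxFP-
export PATH="$PATH:$ELDK_DIR/usr/bin:$ELDK_DIR/bin"

# hello.c is a hypothetical test program.
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void)
{
	printf("Hello, world\n");
	return 0;
}
EOF

if command -v "${CROSS_COMPILE}gcc" >/dev/null 2>&1; then
	"${CROSS_COMPILE}gcc" -o "$workdir/hello" "$workdir/hello.c"
	# file(1) should report a PowerPC ELF executable, not a host binary.
	file "$workdir/hello"
else
	echo "${CROSS_COMPILE}gcc not found; check the ELDK installation and PATH"
fi
```

The same CROSS_COMPILE value is picked up by the U-Boot and Linux makefiles, which is why it is worth exporting it once in the shell.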
- Trivial FTP (TFTP, UDP based) -> indispensable for early development
- Standalone or via inetd (xinetd)
- edit /etc/xinetd.d/tftp to enable
- softlink /tftpboot/tftpboot -> .
- NFS export options: rw,no_root_squash,sync
- diagnostic tools: rpcinfo(8), showmount(8) and syslog
- pass information to target
- dynamic ip address assignment can avoid terrible problems!
subnet 10.0.0.0 netmask 255.0.0.0 {
    option routers 10.0.0.2;
    option subnet-mask 255.0.0.0;
    option domain-name "local.net";
    option domain-name-servers ns.local.net;
    host trgt {
        hardware ethernet 00:30:BF:01:02:D0;
        fixed-address 10.0.0.99;
        option root-path "/opt/eldk/ppc_8xx";
        option host-name "tqm";
        next-server 10.0.0.2;
        filename "/tftpboot/TQM8xxL/uImage";
    }
}
- recent versions require ddns-update-style
-> valid styles: none, interim, ad-hoc
- tools: kermit (ckermit), cu, minicom
- watch out for permission / group membership issues
set line /dev/ttyS0
set speed 115200
set carrier-watch off
set handshake none
set flow-control none
robust
set file type bin
set file name lit
set rec pack 1000
set send pack 1000
set window 5
- power on the board ... is it alive?
DULG: Booting Embedded Linux on the target
DULG chapter
- first steps:
help
- environment variables
- predefined or user defined
- can contain scripts (run)
- setting, deleting
- common pitfalls: cursor keys, using '=' in setenv, all numbers are hex!
- U-Boot will not try to stop you from doing something silly!
sequoia:
| serial# | ethaddr | eth1addr |
| 52192 | 00:10:ec:00:cb:e0 | 00:10:ec:80:cb:e0 |
| 50453 | 00:10:ec:00:c5:15 | 00:10:ec:80:c5:15 |
| 52173 | 00:10:ec:00:cb:cd | 00:10:ec:80:cb:cd |
kilauea:
| serial# | ethaddr | eth1addr |
| 082KLM2028Z | 00:06:4B:10:20:72 | 00:06:4B:10:20:73 |
| 082KLM2064Z | 00:06:4B:10:20:8c | 00:06:4B:10:20:8d |
| 082KLM2027Z | 00:06:4B:10:20:82 | 00:06:4B:10:20:83 |
| 082KLM2045Z | 00:06:4B:10:20:9a | 00:06:4B:10:20:9b |
| 081KLM2002Z | 00:06:4B:10:1b:0b | 00:06:4B:10:1b:0c |
- Using environment variables to simplify boot.
- Example: run net_nfs, flash_nfs, flash_self
git basics
- life before git (tarballs+patches, bitkeeper, many discussions about licence, reverse engineering of bk, separate ways ...)
- then use what? (requirements: distributed, robust, fast) -> none were qualified
- Linus: "I can write something better in two weeks". And he did!
- git as a prime example of software engineering under Unix
- git objects are identified by SHA1 sums!
- objects:
- blobs (file contents)
- trees (blobs or other trees)
- commits (tree, parent commits, message)
- tags (object (usually a commit), name (2.6.26), message, signature)
- one SHA1 commit describes complete history (uniquely, cryptographically strong)
- 3 stages: working dir -> index (staging area) -> git object db
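How an object gets its SHA1 name can be checked by hand. The sketch below assumes the usual blob encoding (a header "blob <size>", a NUL byte, then the file contents) and uses sha1sum(1) from coreutils; the content "hello" is just an example.

```shell
# A blob's object name is the SHA-1 of the header "blob <size>\0" plus the content.
content='hello'
size=$(( ${#content} + 1 ))          # +1 for the trailing newline
hash=$(printf 'blob %d\0%s\n' "$size" "$content" | sha1sum | cut -d' ' -f1)
echo "$hash"

# If git is installed, it computes the same name for the same content.
if command -v git >/dev/null 2>&1; then
	printf '%s\n' "$content" | git hash-object --stdin
fi
```

Because the name depends only on the contents, identical files are stored once, and a commit name pins down every tree and blob reachable from it.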
Configuration
introduce yourself:
$ git config --global user.name "Your Name Comes Here"
$ git config --global user.email you@yourdomain.example.com
Create a new repository
$ git init
$ git add .
$ git commit
Working with existing repositories
Best practice: work on a branch and leave master branch unmodified (track upstream)
clone a repository:
$ git clone git://www.denx.de/git/linux-2.6-denx.git
look at the history
$ git log # simple
$ tig # tui
add a branch for hacking
$ git checkout -b hacking HEAD
edit some files and look at changes in working dir
$ git diff
add modified files to the index:
$ git add file1 file2
delete some and mark them as deleted:
$ git rm file3
look at diffs of the modified files added to the index
$ git diff --cached
get some general statistics of which files were modified:
$ git status
to unstage changes already added to the index use git reset HEAD, to throw away the changes in the working directory use git checkout -f
commit the changes added to the index (will be prompted for commit message)
$ git commit # -s adds Signed-off-by: ... line
alternatively without using the index as an intermediate stage:
$ git commit -a
unhappy with commit message?
$ git commit --amend
local branches
what branches have we got?
$ git branch
create a branch as a clone of the current branch:
$ git branch hacking
check out the contents of this branch:
$ git checkout hacking
hack around, make a mess of it and decide to throw it away
$ git checkout -f
$ git checkout master
$ git branch -d hacking
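The throw-away workflow can be rehearsed safely in a scratch repository. This is only a sketch: the temporary directory, the file name and the identity are placeholders, and the default branch name is queried instead of assuming master.

```shell
# Throwaway repository in a temporary directory; names and identity are placeholders.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com

echo "first version" > file1
git add file1
git commit -q -m "initial commit"
main_branch=$(git symbolic-ref --short HEAD)   # "master" or whatever the default is

git checkout -q -b hacking        # create the branch and switch to it
echo "a mess" >> file1            # hack around ...
git checkout -f -q                # ... and throw the uncommitted changes away
git checkout -q "$main_branch"
git branch -q -d hacking          # hacking has no commits of its own, so -d suffices
git branch
```

Had the mess been committed on the hacking branch, git branch -d would refuse to delete the unmerged branch and -D would be needed instead.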
git for developers
- git mostly aims at making the maintainer's life easier
- conflict: maintainers expect logical separation of features but developers need to
- commit often (traceable development)
- rework code they have already committed
- some tools, e.g. stacked git make things easier for developers
- git's solution: rebasing
- idea is to commit often during development and later "squash" commits into logical ones
Example: rework last 5 commits
$ git rebase --interactive HEAD~5
move lines around and change pick into squash in order to merge a commit into the previous one. If the rebase goes wrong, bail out with:
$ git rebase --abort
goodies
- whose fault is it: git blame
- which commit introduced the bug: git bisect
- maintain a clean repository: git gc
- check repository for consistency: git fsck
- remove all those untracked files: git clean -df
- tree too big?
- use a shared repository (objects on demand): git clone -s
- reference a local repository: git clone --reference
- make a shallow copy: git clone --depth
- Overview U-Boot build process (three steps)
make distclean
make sequoia_config
make all
- resulting files are u-boot.bin (binary), u-boot (ELF),
u-boot.srec (S-Record) and u-boot.map (memory map)
- the make system (oldconfig, menuconfig, help, etc)
- kernel compilation: ppc, powerpc, zImage (cuImage.sequoia)
- FDT: flattened device tree
- booting
ARCH=ppc
- NFS root file system
- ramdisk
- preboot
- This variable can be used to run specific code before the normal startup
(i.e. CONFIG_BOOTDELAY loop, autostart or interactive mode) begins.
This is especially useful when its contents are generated automatically, e.g. by checking for keypresses.
- update
- [todo: explain update procedures using U-Boot]
- Loading Compressed Images
- Sometimes you may want to store
some data in compressed format (for example to save flash memory), while
your application needs the data in uncompressed form. You can trick U-Boot
into doing the uncompressing like this:
Generate a compressed U-Boot image of type "standalone" ("mkimage ... -T
standalone -C gzip ...") and make sure that the environment variable
autostart is set to no (i.e. enter
"setenv autostart no;saveenv"). If you then use "bootm" for such an
image, U-Boot will uncompress the contents of the image and store it at the
"load address" ("-a" option for mkimage), but not attempt to start it yet.
If the image contains executable code, you can omit the setting of
"autostart", and U-Boot will automagically start the image by jumping
to the entry point address ("-e" option for mkimage).
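On the host side, building such an image could look roughly like this. The file names, the image name and the address 0x00400000 are placeholders chosen for illustration and must be adapted to the memory map of the board; the mkimage call is skipped if the tool is not installed.

```shell
# Assumptions: data.bin, the image name and the load address 0x00400000 are
# placeholders; pick an address that suits your board's memory map.
workdir=$(mktemp -d)
printf 'example payload\n' > "$workdir/data.bin"

# mkimage does not compress anything itself: -C only records the compression
# type in the image header, so the data has to be gzipped beforehand.
gzip -9 -c "$workdir/data.bin" > "$workdir/data.bin.gz"

if command -v mkimage >/dev/null 2>&1; then
	mkimage -A ppc -O u-boot -T standalone -C gzip \
		-a 0x00400000 -e 0x00400000 \
		-n "compressed data" \
		-d "$workdir/data.bin.gz" "$workdir/data.img"
else
	echo "mkimage not found; it is built in the tools/ directory of the U-Boot tree"
fi
```

For a pure data image the entry point is never used, so it is simply set to the load address here.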
Definition:
"In software engineering, a design pattern is a general reusable
solution to a commonly occurring problem in software design", Wikipedia
Recall:
- Simple tools, connected by text streams
- partitioning at process level
- mechanism, not policy
(mostly from "The Art of Unix Programming", Eric S. Raymond)
- Filter
- maybe most fundamental Unix pattern
- reads on stdin, emits to stdout
- examples: tail(1), grep(1), tr(1)
- beware of UUOCA
- Source
- no input, outputs on stdout
- examples: ls(1), ps(1), uname(1), w(1)
- Sink
- consumes from stdin, but doesn't output
- examples: lpr(1), mail(1)
- Cantrip
- no input, no output
- just "does" something
- examples: clear(1), touch(1)
- Compiler
- no stdin, no stdout
- takes one or two file args
- converts file 1 and writes the result to file 2
- examples: gcc(1), convert(1), gpg(1), gzip(1)
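A few lines of shell are enough to see the filter and source patterns at work, including the useless use of cat that the award is about. This is only an illustration; the upcase function, the word list and the temporary file are made up.

```shell
# A filter written as a shell function: reads stdin, writes stdout.
upcase() {
	tr '[:lower:]' '[:upper:]'
}

workdir=$(mktemp -d)
printf 'alpha\nbravo\ncharlie\n' > "$workdir/words.txt"

# Useless use of cat: the extra process adds nothing ...
with_cat=$(cat "$workdir/words.txt" | grep 'a$' | upcase)
# ... because the filter can read the file through redirection.
without_cat=$(grep 'a$' < "$workdir/words.txt" | upcase)

# Source feeding a filter: ls(1) emits names, grep(1) selects and counts them.
count=$(ls "$workdir" | grep -c '\.txt$')

echo "$without_cat"
echo "$count"
```

Since upcase neither knows nor cares where its input comes from, it can be dropped into any pipeline unchanged.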
some of the more important, there are more:
- the ed pattern
- takes commands interactively
- may emit output
- examples: ed(1), gnuplot(1), gdb(1)
- can often read commands on stdin
- the Roguelike Pattern (TUI pattern)
- text based user interface
- single stroke commands
- many examples: top(1), mutt(1), gdbtui(1), watch(1), vi(1), lynx(1)
- Driver/Engine pair
- driver and engine interact via some IPC method
- engine can run standalone
- examples: ddd+gdb, gnomebaker+cdrecord
- the CLI Server Pattern
- standalone source program
- invoked by a harness program that changes stdin and stdout
- examples: xinetd/inetd (harness) and pop3, tftp, smtp,...
- shell redirection
- shellouts: ($EDITOR, $PAGER)
- pipes (named and unnamed)
- implicit synchronisation
- unidirectional
- Signals
- sent to a process
- SIGUSR1 and SIGUSR2 can be used for applications
- conventions for daemons:
- SIGHUP: reinitialize
- SIGTERM: graceful shutdown
- SIGKILL: unblockable, will kill the process
- SIGINT: ctrl-c
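The daemon conventions can be tried out with trap in a small script. This is a sketch, not a real daemon: the script signals itself instead of waiting for an administrator, and the file name daemon.sh and the function names are invented for the example.

```shell
workdir=$(mktemp -d)

# daemon.sh is a hypothetical stand-in for a long-running service.
cat > "$workdir/daemon.sh" <<'EOF'
reloads=0
running=1

reload_config() { reloads=$((reloads + 1)); }   # SIGHUP: reinitialize
graceful_stop() { running=0; }                  # SIGTERM: graceful shutdown

trap reload_config HUP
trap graceful_stop TERM
# SIGKILL cannot be trapped, so there is no handler to install for it.

# Stand-in for "kill -HUP <pid>" issued from outside.
kill -HUP $$
kill -HUP $$
kill -TERM $$

echo "reloads=$reloads running=$running"
EOF

result=$(sh "$workdir/daemon.sh")
echo "$result"
```

A real daemon would check the running flag in its main loop and re-read its configuration inside the SIGHUP handler.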
POSIX IPC
- shared memory
- shm_open(3), shm_unlink(3), mmap(2)
- tmpfs needs to be mounted at /dev/shm (for glibc-2.2 and above)
- semaphores
- named
- sem_open(3), sem_wait(3), sem_close(3), ...
- persistent, must be destroyed
- unnamed
- sem_init(3)
- resides in memory
- message queues
Sockets:
- full duplex
- Internet domain sockets
- Unix domain sockets
- for processes on the same machine
- faster than PF_INET: no headers, protocols, checksums, sequence numbers or ACKs
- SysV IPC
- contra: own namespace
- message queues non pollable
- Threads:
- complicated!
- requires synchronization and locking
- introduces class of timing errors: hard to debug and verify
- one kills all syndrome
- many libraries not thread safe
- Lee06
Requirements:
- multiple tests should run in parallel
- GUI
- C
What we did
...
What we should have done...
- at a certain level of complexity a mini-language can help to
further sustain the mechanism and policy separation
- two ways: internal and external DSL
- internal: use an existing language (extend or embed)
- examples: Lisp in emacs, Ruby
- external: create a new language (write a parser)
- lex and yacc
- better: ANTLR
- examples: dtc, dc, awk, graphviz
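awk shows how much policy a few characters of an external mini-language can carry. In the sketch below the input data (names with sizes) is invented; the awk program selects entries above a threshold and summarizes them.

```shell
# Hypothetical input: one "name value" pair per line.
report=$(printf 'uboot 12\nkernel 30\nrootfs 58\n' |
	awk '$2 > 20 { sum += $2; n++ } END { printf "%d entries, total %d\n", n, sum }')
echo "$report"
```

The mechanism (reading records, splitting fields) is awk's job; the one-line program only states the policy, so changing the threshold or the report does not touch any C code.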
- simple tools:
lsof(1), strace(1), /proc/pid/, top(1)
- gdb(1) and friends
- two variants: target gdb or gdbserver
- gdbtui, ddd
- eclipse?
- statistical profiling with oprofile
Using FLASH file systems
Designing and building root file systems
- available file systems for embedded systems
- examine by: Boot-speed - RAM footprint - Flash footprint - Updates to single files - Persistency
Build Linux command line config tool:
- untar
- unzip bdisetup (create directory first)
- run make
- copy bdisetup -> /usr/local/bin
Usage:
[mrk@sokrates bdisetup]$ bdisetup
Usage of BDI setup program V1.16:
bdisetup -v [-pP] [-bB] [-s]
-v Read current versions
P Port e.g. /dev/ttyS0
B Baudrate 9, 19, 38, 57 or 115
-s if present, exit loader and start firmware
bdisetup -e [-pP] [-bB]
-e Erase firmware and logic
P Port e.g. /dev/ttyS0
B Baudrate 9, 19, 38, 57 or 115
bdisetup -u [-pP] [-bB] [-aA] [-tT] [-dD]
-u Update firmware and/or logic
P Port e.g. /dev/ttyS0
B Baudrate 9, 19, 38, 57 or 115
A Application type STD,GDB,ADA,TOR,ACC
T Target type: PPC400,MPC500,PPC600,PPC700,MPC800
MPC7400,MPC7450,MPC8200,MPC8300,MPC8500,MPC8641
ARM,ARM11,XSCALE,MIPS,MIPS64
CPU32,MCF,HC12,MCORE
D Directory with the firmware/logic files
bdisetup -c [-pP] [-bB] [-iI] [-hH] [-mM] [-gG] [-fF]
-c Program network configuration
P Port e.g. /dev/ttyS0
B Baudrate 9, 19, 38, 57 or 115
I BDI IP address e.g. 100.100.100.100
H Host IP address
M Subnet mask (default: 255.255.255.255)
G Gateway IP address (default: 255.255.255.255)
F Configuration file name
Example: Query the current bdi configuration:
(Note: this is a simple tool. It does not perform any device locking, so root permissions are required to access the serial port.)
[root@sokrates gdbpp421-1.16-4xx-2006-11-03]# bdisetup -v -p/dev/ttyS0 -b115
BDI Type : BDI2000 Rev.C (SN: 93202220)
Loader : V1.05
Firmware : V1.05 bdiGDB for MPC85xx
Logic : V1.05 PPC6xx/PPC7xx
MAC : 00-0c-01-93-20-22
IP Addr : 192.168.10.6
Subnet : 255.255.0.0
Gateway : 255.255.0.255
Host IP : 192.168.1.1
Config : bdi6.cfg
Example: Install firmware for ppc4xx processors
[root@sokrates gdbpp421-1.16-4xx-2006-11-03]# bdisetup -u -p/dev/ttyS0 -b115 -aGDB -tPPC400 -d.
Connecting to BDI loader
Erasing CPLD
Programming firmware with ./b20pp4gd.116
Erasing firmware flash ....
Erasing firmware flash passed
Programming firmware flash ....
............................................................................................................................................
Programming firmware flash passed
Programming CPLD with ./pp4jed21.103
............................................................................................................................................................................................................................................................................
Programming CPLD passed
[root@sokrates gdbpp421-1.16-4xx-2006-11-03]#
Check everything went ok:
[root@sokrates gdbpp421-1.16-4xx-2006-11-03]# bdisetup -v -p/dev/ttyS0 -b115
BDI Type : BDI2000 Rev.C (SN: 93202220)
Loader : V1.05
Firmware : V1.16 bdiGDB for PPC400
Logic : V1.03 PPC400
MAC : 00-0c-01-93-20-22
IP Addr : 192.168.10.6
Subnet : 255.255.0.0
Gateway : 255.255.0.255
Host IP : 192.168.1.1
Config : bdi6.cfg
Setting Configuration Parameters:
[root@sokrates gdbpp421-1.16-4xx-2006-11-03]# bdisetup -c -p/dev/ttyS0 -h192.168.5.1 -i192.168.5.21 \
-m255.255.0.0 -g255.255.255.255 -fsequoia/sequoia.cfg
Connecting to BDI loader
Writing network configuration
Configuration passed
then check if it went ok (note the -s to exit loader and start the firmware):
[root@sokrates gdbpp421-1.16-4xx-2006-11-03]# bdisetup -v -p/dev/ttyS0 -b115 -s
BDI Type : BDI2000 Rev.C (SN: 93202220)
Loader : V1.05
Firmware : V1.16 bdiGDB for PPC400
Logic : V1.03 PPC400
MAC : 00-0c-01-93-20-22
IP Addr : 192.168.5.21
Subnet : 255.255.0.0
Gateway : 255.255.255.255
Host IP : 192.168.5.1
Config : /tftpboot/sequoia/sequoia.cfg
Quick Test:
[root@sokrates gdbpp421-1.16-4xx-2006-11-03]# telnet 192.168.5.21
Trying 192.168.5.21...
Connected to bdi (192.168.5.21).
Escape character is '^]'.
BDI Debugger for Embedded PowerPC
=================================
MD [<address>] [<count>] display target memory as word (32bit)
MDH [<address>] [<count>] display target memory as half word (16bit)
MDB [<address>] [<count>] display target memory as byte (8bit)
DUMP <addr> <size> [<file>] dump target memory to a file
MM <addr> <value> [<cnt>] modify word(s) (32bit) in target memory
MMH <addr> <value> [<cnt>] modify half word(s) (16bit) in target memory
MMB <addr> <value> [<cnt>] modify byte(s) (8bit) in target memory
MC [<address>] [<count>] calculates a checksum over a memory range
MV verifies the last calculated checksum
RD [<name>] display general purpose or user defined register
RDUMP [<file>] dump all user defined register to a file
RDSPR <number> display special purpose register
RDDCR <number> display device control register
RM {<nbr>|<name>} <value> modify general purpose or user defined register
RMSPR <number> <value> modify special purpose register
RMDCR <number> <value> modify device control register
TLB <from> [<to>] display TLB entry
WTLB <idx> <epn> <rpn> write TLB entry (only PPC440)
DFLUSH [<addr>] flush data cache (addr = cached memory address)
IFLUSH invalidate instruction cache
DCACHE <from> [<to>] display L1 data cache (440: lines, 405: sets)
ICACHE <from> [<to>] display L1 inst cache (440: lines, 405: sets)
BOOT reset the BDI and reload the configuration
RESET [HALT | RUN [time]] reset the target system, change startup mode
BREAK [SOFT | HARD] display or set current breakpoint mode
GO [<pc>] set PC and start target system
GO <n> <n> [<n>[<n>]] start multiple cores in requested order
TI [<pc>] trace on instuction (single step)
TC [<pc>] trace on change of flow
HALT stop all cores via HALT pin
STOP [<n>[<n>[<n>[<n>]]]] stop core(s) via JTAG port (n = core number)
BI <addr> set instruction breakpoint
CI [<id>] clear instruction breakpoint(s)
BD [R|W] <addr> set data breakpoint (32bit access)
BDH [R|W] <addr> set data breakpoint (16bit access)
BDB [R|W] <addr> set data breakpoint ( 8bit access)
CD [<id>] clear data breakpoint(s)
INFO display information about the current state
LOAD [<offset>] [<file> [<format>]] load program file to target memory
VERIFY [<offset>] [<file> [<format>]] verify a program file to target memory
PROG [<offset>] [<file> [<format>]] program flash memory
<format> : SREC or BIN or AOUT or ELF
ERASE [<address> [<mode>]] erase a flash memory sector, chip or block
<mode> : CHIP, BLOCK or SECTOR (default is sector)
ERASE <addr> <step> <count> erase multiple flash sectors
UNLOCK [<addr> [<delay>]] unlock a flash sector
UNLOCK <addr> <step> <count> unlock multiple flash sectors
FLASH <type> <size> <bus> change flash configuration
DELAY <ms> delay for a number of milliseconds
SELECT <core> change the current core
HOST <ip> change IP address of program file host
PROMPT <string> defines a new prompt string
CONFIG display or update BDI configuration
CONFIG <file> [<hostIP> [<bdiIP> [<gateway> [<mask>]]]]
HELP display command list
JTAG switch to JTAG command mode
QUIT terminate the Telnet session
- TARGET: waiting for target Vcc
440EPx>quit
Connection closed by foreign host.
[root@sokrates gdbpp421-1.16-4xx-2006-11-03]#
- where does board start after reset (reset vector)?
440EPx>info
Core number : 0
Core state : debug mode
Debug entry cause : JTAG stop request
Current PC : 0xfffffffc
Current CR : 0x28000082
Current MSR : 0x00000000
Current LR : 0x0ff91484
440EPx>
- figure out flash layout: FLASH_BASE, CFG_MONITOR_LEN, TEXT_BASE
440EPx>erase 0xfffa0000
Erasing flash at 0xfffa0000
Erasing flash passed
440EPx>erase 0xfffc0000
Erasing flash at 0xfffc0000
Erasing flash passed
440EPx>erase 0xfffe0000
Erasing flash at 0xfffe0000
Erasing flash passed
440EPx>prog
# enter filename and host IP address
440EPx>host 192.168.5.1
Host IP address is 192.168.5.1
440EPx>prog 0xfffa0000 /tftpboot/sequoia/u-boot.bin bin
Programming /tftpboot/sequoia/u-boot.bin , please wait ....
Programming flash passed
440EPx>
- two phases: before and after relocation
- simple debugging with bdi
- with gdb
- before relocation
- after relocation
cd /opt/eldk-4.1/ppc_4xxFP/usr/src/linux-2.6.19.2-xenomai/
make mrproper
make ARCH=ppc sequoia_defconfig
configure additional options from "Real-time subsystem" toplevel menu:
make ARCH=ppc menuconfig
and then run make:
make ARCH=ppc uImage
The resulting uImage is a Xenomai/Adeos-enabled kernel.
- download Xenomai from www.xenomai.org
$ scripts/prepare-kernel.sh --linux=<linux-srctree> [--adeos=<adeos-patch>] [--arch=<target-arch>]
example:
../../git/xenomai/scripts/prepare-kernel.sh --linux=. --arch=i386 --adeos=../adeos-ipipe-2.6.23-i386-1.10-10.patch
- configure (Options are available from the "Real-time subsystem" toplevel menu) and build kernel
- build the userspace tools (automake) and install
- Detailed instructions for (cross-) building can be found in README.INSTALL in the Xenomai sources
Options are:
- -t 3 : three threads
- -p 80 : highest thread priority is 80
- -i 1000 : base interval of 1000us = 1ms
- -l 10000 : 10000 loops
cyclictest -t 3 -p 80 -i 1000 -l 10000
There are quite a few scripts/tools out there that will help
in automatically generating a toolchain, a root filesystem,
a ROM image, and other components of an embedded Linux system.
This is a collection of such tools, scripts, tips and tricks:
- buildroot: http://www.uclibc.org/cgi-bin/cvsweb/buildroot/
This is a set of scripts maintained by Erik Andersen (uClibc,
BusyBox, TinyLogin, etc.) that can be used to build quite a
few components (including toolchain for host and target
binaries.) Though I haven't used buildroot, the content of
the scripts is one of the best I've seen (well organized
and thorough).
- PTXdist: http://www.pengutronix.de/software/ptxdist_en.html
This tool is maintained by Robert Schwebel (utelnetd, RTAI
documentation, etc.) that creates root filesystems. It
relies on the use of a KConfig-based frontend. There are
nice screenshots.
- Bill Gatliff's build script:
http://billgatliff.com/twiki/bin/view/Crossgcc/BuildToolchainScript
This is one script which builds a toolchain made by Bill
Gatliff (gdb stubs, regular contributor to Embedded Systems
Programming, reviewed my book, etc.)
- Craig Hollabaugh's build scripts:
http://www.embeddedlinuxinterfacing.com/sourcecode.shtml
These are the scripts Craig used in his "Embedded Linux"
book. There is a buildtoolchain.tar.gz, which is the set
of scripts used to build the toolchain, and there's
buildrootfilesystem, which is used to ... well, build
the root filesystem
- Richard Atterer's build script:
http://home.in.tum.de/~atterer/debian/
Again, a script to build a toolchain.
- David A. Desrosiers' build script:
http://libarynth.f0.am/cgi-bin/view/Libarynth/CrossCompiler
Yet another toolchain build script.
- "Peewee Linux":
http://www.peeweelinux.com
"an environment that makes the configuration and installation of a Linux operating system on an embedded platform as easy and painless as possible"
- http://www.bluewaternz.com/startoff/armlinux.htm
[For ARM targets] a script which conveniently builds binutils, cross compiler, glibc, and the kernel.
[From: Karim Yaghmour et al. on the etux mailing list.]
Pointers to more information are collected in the DULG document
in the MoreIformation topic.
Well, that's all for now folks.
Happy Hacking! and may the source be with you.
- 21 year old Linus Torvalds starts "educational" project in 1991
- Tanenbaum's Minix served as a Model, but no code from it was used.
- Torvalds chose the GPL (GNU General Public License)
- many people helped, first self hosting release (0.11) in December 1991
- A.B.C (e.g. 2.4.23):
- A = kernel version
- B = major version (old: even=stable, odd=devel)
- C = minor version
- versioning changed from 2.6.11 on:
- Stable kernels A.B.C.D. D incremented on bugfixes
- C incremented every couple of months with larger changesets
- other naming A.B.C-xx (rc, mm, ac, ck)
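Kernel code usually compares versions numerically through the KERNEL_VERSION macro from include/linux/version.h. The following is a minimal userspace sketch of that encoding. The macro body mirrors the classic kernel definition; version_code() is a hypothetical helper invented here for illustration and is not part of any kernel API.

```c
#include <stdio.h>

/* Same encoding as the classic macro in include/linux/version.h */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical helper: turn a release string such as "2.6.25-rc2" into a
 * version code. Returns -1 if the string does not start with A.B.C.
 * Suffixes like -rc2 or -mm1 are ignored.
 */
static int version_code(const char *release)
{
    int a, b, c;

    if (sscanf(release, "%d.%d.%d", &a, &b, &c) != 3)
        return -1;
    return KERNEL_VERSION(a, b, c);
}
```

Because each component gets its own byte, a plain integer comparison orders releases correctly as long as B and C stay below 256.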
Definition:
- A Device Driver is a core software component of an operating system that abstracts a device and provides a well defined interface for generic core code
Characteristics:
- "contact" with userspace or not
Kernel components a driver interacts with:
- File system (device files, file operations ...)
- Process management
- bad example:
current->state = TASK_INTERRUPTIBLE
- user vs. kernel space
- resource management
- availability of resources (memory, stackspace, libraries)
- memory protection
- batch vs. event driven programming
- concurrency
- GPL and LGPL
- GPL guarantees the following freedoms (as in free speech, not free beer)
- You can run the program as you wish for any purpose
- You can study the source code and change it to do what you want
- You can make copies and distribute them
- You can distribute modified versions
- Derivative work inherits the license (often mislabelled the "viral nature" of the GPL)
- Linking is considered derivative work under the GPL but not under the GNU Lesser General Public License (LGPL)
-> GNU libc is the most prominent LGPL-licensed program
- BSD style licences:
- MIT: Copyright and permission notice must be included in all copies or substantial parts of the Software
- BSD: non-endorsement clause (contributors' names may not be used to endorse or
promote products without permission), advertising clause (removed 1999)
- Apache License v2.0: strong branding, contributions become part of licensed work, license terminates if licensee initiates patent litigation
- -> BSD style licenses have no such derivative work clause and thus cannot prevent "misuse" of software (OpenBSD, FreeBSD, NetBSD)
- -> commercial application is usually intended (BSD Unix, X Window System)
- MPL (Mozilla Public License)
- somewhere in between BSD and GPL
- weak Copyleft: MPL source code modified stays MPL, but can be combined with any other code to form a proprietary program
- NPL (Netscape Public License): variant that gives the original developer the right to distribute
modifications by other contributors under whatever terms it desires.
- Open Source vs. Free Software
- Open source can be any license acknowledged by the Open Source Initiative (incl. all mentioned above)
- Because of so many licenses all dubbed "open source", check which freedoms they really entail
- Mailing lists: linux-kernel (high load) and respective arch lists
git basics
- life before git (tarballs+patches, bitkeeper, many discussions about licence, reverse engineering of bk, separate ways...)
- then use what? (requirements: distributed, robust, fast) -> none were qualified
- Linus: "I can write something better in two weeks". And he did!
- git as a prime example of software engineering under Unix
- git objects are identified by SHA1 sums!
- objects:
- blobs (file contents)
- trees (blobs or other trees)
- commits (tree, parent commits, message)
- tags (object (usually commit), name (2.6.26), message, signature)
- one SHA1 commit describes complete history (uniquely, cryptographically strong)
- 3 stages: working dir -> index (staging area) -> git object db
Configuration
introduce yourself:
$ git config --global user.name "Your Name Comes Here"
$ git config --global user.email you@yourdomain.example.com
Create a new repository
$ git init
$ git add .
$ git commit
Working with existing repositories
Best practice: work on a branch and leave master branch unmodified (track upstream)
clone a repository:
$ git clone git://www.denx.de/git/linux-2.6-denx.git
look at the history
$ git log # simple
$ tig # tui
add a branch for hacking
$ git checkout -b hacking HEAD
edit some files and look at changes in working dir
$ git diff
add modified files to the index:
$ git add file1 file2
delete some and mark them as deleted:
$ git rm file3
look at diffs of the modified files added to the index
$ git diff --cached
get some general statistics of which files were modified:
$ git status
to unstage changes already added to the index: git reset HEAD; to throw away changes in the working directory: git checkout -f
commit the changes added to the index (will be prompted for commit message)
$ git commit # -s adds Signed-off-by: ... line
alternatively without using the index as an intermediate stage:
$ git commit -a
unhappy with commit message?
$ git commit --amend
local branches
what branches have we got?
$ git branch
create a branch as a clone of the current branch:
$ git branch hacking
check out the contents of this branch:
$ git checkout hacking
hack around, make a mess of it and decide to throw it away
$ git checkout -f
$ git checkout master
$ git branch -d hacking
git for developers
- git mostly aims at making the maintainer's life easier
- conflict: maintainers expect logical separation of features but developers need to
- commit often (traceable development)
- rework code they have already committed
- some tools, e.g. stacked git, make things easier for developers
- git's solution: rebasing
- idea is to commit often during development and later "squash" commits into logical ones
Example: rework last 5 commits
$ git rebase --interactive HEAD~5
move lines around and change pick into squash in order to merge a commit into the previous one.
if the rebase goes wrong, abandon it and return to the original state:
$ git rebase --abort
goodies
- whose fault is it:
git blame
- which commit introduced the bug:
git bisect
- maintain a clean repository:
git gc
- check repository for consistency:
git fsck
- remove all those untracked files:
git clean -d -f
- tree too big?
- use a shared repository (objects on demand)
git clone -s
- reference a local repository:
git clone --reference
- make a shallow copy:
git clone --depth
- Let's take a look at the sources!
- the make system (oldconfig, menuconfig, help, etc)
- kernel compilation: ppc, powerpc, zImage (cuImage.sequoia)
- FDT: flattened device tree
Code:
/*
 * A first simple hello_world kernel module
 */
#include <linux/init.h>
#include <linux/module.h>
static int hello_init(void)
{
    printk(KERN_ALERT "Hello world!\n");
    return 0;
}
static void hello_exit(void)
{
    printk(KERN_ALERT "Goodbye, cruel world\n");
}
module_init(hello_init);
module_exit(hello_exit);
Makefile:
obj-m := hello.o
Build it:
make -C ~/git/linux-2.6-denx/ M=`pwd` modules ARCH=powerpc
loading
Extensions:
- __init, __initdata, __exit, __exitdata (include/linux/init.h)
- __devinit / __devexit (equal to __init / __exit if !CONFIG_HOTPLUG, else normal functions)
- MODULE_AUTHOR, MODULE_LICENSE, MODULE_DESCRIPTION
- Kernel symbol visibility:
static, EXPORT_SYMBOL[_GPL[_FUTURE]](a)
- check out kernel symbols:
/proc/kallsyms (nm(1))
- Two ways to build modules: statically linked in vs dynamically loadable
- Extensive documentation:
Documentation/kbuild/
- build a kernel with module statically compiled in
Makefile:
#
# Slightly more advanced Makefile for building the hello world module
#
ifneq ($(KERNELRELEASE),)
obj-m := hello.o
else
KDIR := /home/mk/git/linux-2.6-denx
PWD := $(shell pwd)
default:
	$(MAKE) -C $(KDIR) M=$(PWD) modules
endif
clean:
	rm -rf .*.cmd .tmp* *.ko *~ *.o *.mod.c *.symvers
- printk is most important debug facility!
- loglevels (include/linux/kernel.h)
- kernel logging (/proc/kmsg)
- default loglevel in /proc/sys/kernel/printk
- current, default, minimum, boot time default
- changing loglevel (write to proc, dmesg)
- Character driver is most common type of driver
- Character drivers transfer arbitrary amounts of characters
- Communication via device files (major and minor numbers)
- major and minor numbers
- used by kernel to identify driver that handles request
- many statically assigned (Documentation/devices.txt) (but no new ones will be)
- need to be allocated (=claimed)
- static or dynamic allocation
- mapping number -> device can vary (1:1, n:1)
API: Device numbers
type:
dev_t
include/linux/types.h
creation, accessors:
MKDEV(int major, int minor)
MAJOR(dev_t dev)
MINOR(dev_t dev)
include/linux/kdev_t.h
register a (known) range of numbers:
int register_chrdev_region(dev_t first, unsigned int count, char *name);
- first: first number in range
- count: how many minor numbers?
- name: name of device
allocate a range of free numbers:
int alloc_chrdev_region(dev_t *dev, unsigned int firstminor, unsigned int count, char *name);
- dev: pointer to existing dev_t, will contain allocated major/minor
- firstminor: usually 0 (but not required)
- name: name of device
free allocated or registered range of numbers:
void unregister_chrdev_region(dev_t first, unsigned int count);
include/linux/fs.h
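To make the major/minor packing concrete, here is a small userspace re-creation of the accessor macros. The bit layout (20 bits for the minor number) follows include/linux/kdev_t.h in 2.6 kernels; dev_t_sketch and nth_device() are hypothetical names used only in this sketch.

```c
#include <stdint.h>

/* Userspace re-creation of the macros in include/linux/kdev_t.h (2.6 layout) */
#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)

typedef uint32_t dev_t_sketch;   /* stand-in for the kernel's dev_t */

#define MAJOR(dev) ((unsigned int)((dev) >> MINORBITS))
#define MINOR(dev) ((unsigned int)((dev) & MINORMASK))
#define MKDEV(ma, mi) (((dev_t_sketch)(ma) << MINORBITS) | (mi))

/* Hypothetical helper: device number of the n-th minor in a registered range */
static dev_t_sketch nth_device(dev_t_sketch first, unsigned int n)
{
    return MKDEV(MAJOR(first), MINOR(first) + n);
}
```

A driver that obtained its first number from alloc_chrdev_region() would typically derive the numbers of its other devices in a similar way.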
- inform kernel about what operations the driver provides for a device
- struct file_operations in include/linux/fs.h
- error handling (goto's, errno)
- careful: driver can be called as soon as it has registered! (cdev_add)
API: Character driver registration
type:
struct cdev
allocate at runtime:
(need to add fops by hand)
struct cdev* cdev_alloc(void);
...
my_cdev->ops = &my_fops;
allocate statically and only initialize at runtime:
void cdev_init (struct cdev* cdev, const struct file_operations *fops);
for both cases initialize owner field to THIS_MODULE:
my_cdev.owner = THIS_MODULE;
register the character driver with the kernel:
int cdev_add(struct cdev *dev, dev_t num, unsigned int count);
- dev: the initialized struct cdev
- num: first device number of device
- count: number of minor numbers
remove (and if necessary free memory):
void cdev_del(struct cdev *dev);
include/linux/cdev.h
old way (do not use anymore):
int register_chrdev(unsigned int major, const char *name, struct file_operations *fops);
void unregister_chrdev(unsigned int major, const char *name);
- open
- called when device opened
- prepares actual device for use (initialize it, enable interrupts ...)
- every open gets its own filp (this allows virtual devices: one for each open)
- release
- called when device is finally closed (once)
- multiple close (fork, dup) avoided by ref counting.
- flush is called for every close(), but seldom used.
- does cleanup (stop device, free memory, ...)
- Problem: how to identify which device has been opened?
- imajor(), iminor()
- use container_of macro to find superstructure from inode->i_cdev
API
open and release:
int open(struct inode *i, struct file *filp)
int release(struct inode *i, struct file *filp)
find device numbers given an inode:
unsigned int imajor(struct inode* i);
unsigned int iminor(struct inode* i);
find containing structure:
container_of(container_field_ptr, container_type, name_of_container_field);
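The container_of trick is easier to see in a few lines of plain C. This is a simplified userspace sketch: the macro omits the type checking of the kernel version, and struct cdev_sketch, struct my_dev and dev_from_cdev() are hypothetical stand-ins.

```c
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of() macro */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the kernel's struct cdev, just enough for the sketch */
struct cdev_sketch {
    int dummy;
};

/* Hypothetical per-device structure that embeds the cdev */
struct my_dev {
    int irq;
    struct cdev_sketch cdev;
};

/* What an open() method would do with the pointer found in inode->i_cdev */
static struct my_dev *dev_from_cdev(struct cdev_sketch *cdev)
{
    return container_of(cdev, struct my_dev, cdev);
}
```

In a driver, open() would typically store the recovered pointer in filp->private_data so that read and write can find the device without repeating the lookup.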
- read and write data to device
- return values:
- = count: the requested amount of bytes was read/written
- > 0 but < count: only part of the data was transferred (no error!)
- = 0: read: EOF, write: unspecified (old: equals EAGAIN)
- < 0: error
- stdio library handles partial reads/writes
- careful with userspace pointers!
- could be unmapped in kernel space
- could be swapped out
- could be invalid!
API
read and write file operations:
ssize_t read(struct file *filp, char __user *buff, size_t count, loff_t *f_pos);
ssize_t write(struct file *filp, const char __user *buff, size_t count, loff_t *f_pos);
userspace access:
unsigned long copy_to_user(void __user *to, const void *from, unsigned long count);
unsigned long copy_from_user(void *to, const void __user *from, unsigned long count);
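Since a driver may legitimately return fewer bytes than requested, an application that calls read(2) directly has to loop itself. A minimal userspace sketch of such a loop follows; read_full() is a hypothetical helper name, not a libc function.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Read until count bytes arrived, EOF was hit, or an error occurred.
 * Returns the number of bytes read, or -1 on error (errno is set).
 */
static ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);

        if (n == 0)             /* EOF */
            break;
        if (n < 0) {
            if (errno == EINTR) /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        done += n;              /* partial read: keep going */
    }
    return (ssize_t)done;
}
```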
- normal (blocking) behaviour
- read: process blocks if no data is available and may return
less than requested
- write: process blocks if output buffer is full. write
returns once free space becomes available in output
buffer. Partial write possible.
- non-blocking behaviour
- open with O_NONBLOCK
- both read and write return -EAGAIN instead of blocking
- allows polling device
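From userspace the driver's -EAGAIN shows up as a return value of -1 with errno set to EAGAIN. A minimal sketch of the application side, assuming an ordinary file descriptor; set_nonblocking() and try_read() are hypothetical helpers, and the -2 return code is a convention of this sketch only.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch an already open descriptor to non-blocking mode; 0 on success */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Try to read without blocking.
 * Returns bytes read, 0 on EOF, -1 on error, -2 if no data is available yet.
 */
static ssize_t try_read(int fd, void *buf, size_t count)
{
    ssize_t n = read(fd, buf, count);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```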
API: waitqueues
type:
wait_queue_head_t
linux/wait.h
Declaration:
/* compile time */
DECLARE_WAIT_QUEUE_HEAD(name);
/* run time */
wait_queue_head_t q;
init_waitqueue_head(&q);
Sleeping:
wait_event(queue, condition)
wait_event_interruptible(queue, condition)
wait_event_timeout(queue, condition, timeout)
wait_event_interruptible_timeout(queue, condition, timeout)
waking up sleeping processes:
void wake_up(wait_queue_head_t *queue);
void wake_up_interruptible(wait_queue_head_t *queue);
specialities:
- exclusive waits: wake only one process on queue (avoids
thundering herd). wake_up wakes up all non-exclusive waiters
and one exclusive waiter.
- wake_up_nr, wake_up_interruptible_nr: wake up to X exclusive
waiters
- wake_up_interruptible_sync: avoid (possible) immediate
rescheduling so that wake_up will return.
deprecated:
void sleep_on(wait_queue_head_t *q);
void interruptible_sleep_on(wait_queue_head_t *q);
View from Userspace:
- process can check if one or more files can be read from or
written to without blocking
- this allows to work with multiple file descriptors
- example: telnet: reads from socket and terminal, can't block on either!
- But why? We've got:
- O_NONBLOCK, we can poll? -> yes, but this wastes CPU time!
- threads -> yes, but threads are considered problematic for many reasons
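The telnet-style situation above can be sketched with poll(2) in userspace: sleep until either of two descriptors has data, without spinning. wait_readable() is a hypothetical helper written for this illustration.

```c
#include <poll.h>

/*
 * Wait until one of two descriptors becomes readable.
 * Returns 0 or 1 for the index of a readable descriptor,
 * -1 on timeout or error. timeout_ms < 0 waits forever.
 */
static int wait_readable(int fd0, int fd1, int timeout_ms)
{
    struct pollfd fds[2] = {
        { .fd = fd0, .events = POLLIN },
        { .fd = fd1, .events = POLLIN },
    };

    if (poll(fds, 2, timeout_ms) <= 0)
        return -1;
    /* POLLHUP counts as readable here: the next read() returns EOF */
    if (fds[0].revents & (POLLIN | POLLHUP))
        return 0;
    if (fds[1].revents & (POLLIN | POLLHUP))
        return 1;
    return -1;
}
```

The process consumes no CPU time while it waits; the kernel wakes it when a driver signals a status change on one of its waitqueues.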
View from Kernelspace:
- driver:
- call poll_wait on one or more waitqueues that inform of
change of poll/select status (typically read and write
queue)
- return a bitmask describing what operations can be performed
without blocking.
- kernel: if a requested operation is possible the kernel
returns, otherwise sets process to sleep on waitqueues and handles
timeouts.
API: poll
header and poll operation
#include <linux/poll.h>
unsigned int poll(struct file *filp, poll_table *wait);
step 1: add waitqueues
void poll_wait(struct file*, wait_queue_head_t *, poll_table *);
step 2: return appropriate status (more in asm/poll.h)
/* readable */
return (POLLIN | POLLRDNORM);
/* writeable */
return (POLLOUT | POLLWRNORM);
- allows userspace direct access to device memory
- benefits:
- avoids buffering: can improve performance
- direct access to registers can avoid many ioctl calls
- only makes sense in certain circumstances: e.g. not for stream
oriented devices
API: mmap
file operation
int mmap(struct file *filp, struct vm_area_struct *vma);
build page tables
int remap_pfn_range(struct vm_area_struct *vma, unsigned long virt_addr,
unsigned long pfn, unsigned long size, pgprot_t prot);
/* identical to above on most platforms: */
int io_remap_pfn_range(struct vm_area_struct *vma, unsigned long virt_addr,
unsigned long phys_addr, unsigned long size, pgprot_t prot);
- simple and effective
- works with default kernel
- little impact on timing (compared to hardware debuggers)
- avoid cluttering syslog: log kernel messages to file during testing
- stop klogd:
killall klogd
- and restart with log to file option
klogd -f /tmp/mymessages
- change kernel loglevel (which messages are printed to console)
- /proc/sys/kernel/printk
- off: dmesg -n 1, full: dmesg -n 8
simple
#define DEBUG
#ifdef DEBUG
# define _DBG(fmt, args...) printk(KERN_DEBUG "%s: " fmt "\n", __FUNCTION__, ##args)
#else
/* no trailing ';' so that the macro stays safe inside if/else */
# define _DBG(fmt, args...) do { } while (0)
#endif
/* example usage */
void read_this(int count)
{
    _DBG("reading count=%d bytes", count);
}
Debug levels:
#define DEBUG
/* the numeric values are an assumption; any ascending order works */
#define DEBUG_LEVEL_LOW 1
#define DEBUG_LEVEL_MEDIUM 2
#define DEBUG_LEVEL_HIGH 3
static int debug = DEBUG_LEVEL_MEDIUM; /* change with ioctl or proc */
#ifdef DEBUG
/* print the message if the current debug level is at least x */
# define _DBG(x, fmt, args...) do { if (debug >= (x)) printk(KERN_CRIT "%s: " fmt "\n", __FUNCTION__, ##args); } while (0)
#else
# define _DBG(x, fmt, args...) do { } while (0)
#endif
/* example usage */
void function1(unsigned int addr1, unsigned int addr2)
{
    _DBG(DEBUG_LEVEL_LOW, "addr1=0x%x, addr2=0x%x", addr1, addr2);
}
void fatal_error(void)
{
    _DBG(DEBUG_LEVEL_HIGH, "this is the end");
}
Debug selective subsystems:
#define DEBUG
#define DEBUG_READ (1<<0)
#define DEBUG_WRITE (1<<1)
#define DEBUG_LOCKS (1<<2)
static unsigned int debug_mask = 0;
#ifdef DEBUG
/* print the message if the subsystem bit x is set in debug_mask */
# define _DBG(x, fmt, args...) do { if (debug_mask & (x)) printk(KERN_DEBUG "%s: " fmt "\n", __FUNCTION__, ##args); } while (0)
#else
# define _DBG(x, fmt, args...) do { } while (0)
#endif
/* example usage */
void read(void)
{
    _DBG(DEBUG_READ, "inside read function");
}
void write(int count)
{
    _DBG(DEBUG_WRITE, "writing count=%d bytes", count);
}
- occurs when driver accesses invalid pointer or NULL pointer
- MMU signals page fault but page is invalid: oops message is
printed. (arch/*/kernel/traps.c)
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xd107c0c8
Oops: Kernel access of bad area, sig: 11 [#1]
Sequoia
Modules linked in: faulty chrdrv [last unloaded: faulty]
NIP: d107c0c8 LR: c0072834 CTR: 0fe368a4
REGS: cf9cfe40 TRAP: 0300 Not tainted (2.6.25-rc2-00102-g0640ebb)
MSR: 00029000 <EE,ME> CR: 48242028 XER: 00000007
DEAR: 00000000, ESR: 00800000
TASK = cfa5d000[950] 'bash' THREAD: cf9ce000
GPR00: d107c0c0 cf9cfef0 cfa5d000 00000000 4801f000 00000004 cf9cff20 00000000
GPR08: 00000000 00000000 00000004 00000000 00000004 100fd4b8 00000000 00000000
GPR16: 10125058 1009e4f0 100f0000 100f7998 00000000 00000000 101219a8 00000000
GPR24: 00000001 100ac1ac bfaf4918 cf9cff20 4801f000 cfa2ba00 0ff448a8 00000004
NIP [d107c0c8] faulty_write+0x8/0x10 [faulty]
LR [c0072834] vfs_write+0xcc/0x16c
Call Trace:
[cf9cfef0] [c0072804] vfs_write+0x9c/0x16c (unreliable)
[cf9cff10] [c0072ec8] sys_write+0x4c/0x90
[cf9cff40] [c000cf20] ret_from_syscall+0x0/0x3c
Instruction dump:
38810008 7cbdf850 480000d1 7c7d1a14 4bffffc4 7fc3f378 38810008 7fe5fb78
480000b9 4bffffb0 39200000 38600000 <91290000> 4e800020 7c0802a6 9421fff0
---[ end trace 4d8ed6ffb404bf6e ]---
- sometimes debuggers can be useful
- Abatron BDI2000 is our tool of choice
- Profiling is a technique for discovering performance bottlenecks
- For Linux two techniques are available:
- 1) full coverage analysis with gcov (see the LWN article)
- not included in mainline
- kernel only
- 2) statistical profiling with OProfile [2]
- included in mainline and ELDK contains userspace support
- works out of the box for kernel and userspace
- OProfile
- Enable in Kernel:
- General Setup -> Profiling Support (Experimental)
- Kernel Hacking -> Compile Kernel with Debug Info
simple session (multiply example taken from [1])
# initialize oprofile
$ opcontrol --init
# where's our kernel?
$ opcontrol --vmlinux=/tmp/vmlinux
# clear old data
$ opcontrol --reset
# available events?
$ opcontrol -l
# start daemon and start collecting data
$ opcontrol --start
# run application
$ while ./multiply; do ./multiply; echo -n .; done
.........
# dump data to daemon
$ opcontrol --dump
# stop daemon
$ opcontrol --stop
# examine
$ opreport -l ./multiply
# annotate
$ opannotate -s ./multiply
- motivation (state pre 2.5):
- no uniform, comprehensive model, only bus specific lists of devices
- procfs was getting messy (large files, no hierarchy)
- more and more hotpluggable devices
- generic power and shutdown management impossible
- solution: driver model
- introduced in 2.5, evolved a lot!
- uniform model for representing busses, devices and drivers as a hierarchy
- sysfs
- often confused with driver model
- special file system that exports hierarchical view of all the devices in the system
- driver model can exist without sysfs, but sysfs not without driver model
- Characteristics:
- discoverable (device classes)
- how interconnected?
- cleaner: one value per sysfs file
- handles reference counting
- generic hotplug support
- The driver model is very complex, but fortunately the kernel does most work for us
example
$ tree -L 1 /sys/
/sys/
|-- block
|-- bus
|-- class
|-- devices
|-- firmware
|-- fs
|-- kernel
|-- module
`-- power
explore with tree(1)
Simplified example:
- bus driver is loaded (Platform, USB, PCI...)
- knows how to match devices and drivers
- usually knows how to detect a device on the bus
- maybe generates hotplug events for userspace
- bus driver detects devices:
- statically by platform code
- during bus driver initialization
- when a device is hotplugged
- somewhere during startup some drivers get loaded
- bus driver tries to match when new drivers or devices appear
- matching:
- driver usually exports table of handled devices
- matching is bus dependent, cannot be implemented generically
- no match && hotplugging enabled:
- old: setup environment and call hotplug function (/proc/sys/kernel/hotplug)
- today: udev/udevd handles this
- try to autoload / modprobe correct module
- if success retry matching
- if match:
- call driver probe() function:
- if < 0: failure
- if >= 0: ok, driver is bound to device
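The match-then-probe sequence above can be illustrated with a toy simulation in plain C. Nothing here is kernel API: struct sim_driver, struct sim_device, sim_match() and sim_bind() are hypothetical stand-ins that only mirror the control flow, with a string compare taking the place of the bus-specific match.

```c
#include <stddef.h>
#include <string.h>

struct sim_device;

/* Hypothetical stand-in for a driver: a name, an ID table and a probe hook */
struct sim_driver {
    const char *name;
    const char *const *id_table;            /* NULL-terminated list of IDs */
    int (*probe)(struct sim_device *dev);   /* < 0: failure, >= 0: bound */
};

/* Hypothetical stand-in for a device found on the bus */
struct sim_device {
    const char *id;
    const struct sim_driver *driver;        /* NULL while unbound */
};

/* Example probe hooks, one that succeeds and one that fails */
static int probe_ok(struct sim_device *dev)   { (void)dev; return 0; }
static int probe_fail(struct sim_device *dev) { (void)dev; return -1; }

/* Bus-specific part: here a plain string compare against the ID table */
static int sim_match(const struct sim_driver *drv, const struct sim_device *dev)
{
    const char *const *id;

    for (id = drv->id_table; *id; id++)
        if (strcmp(*id, dev->id) == 0)
            return 1;
    return 0;
}

/* Generic part: try every driver, bind the first whose probe() succeeds */
static int sim_bind(struct sim_device *dev,
                    const struct sim_driver *const *drivers, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!sim_match(drivers[i], dev))
            continue;
        if (drivers[i]->probe(dev) >= 0) {
            dev->driver = drivers[i];
            return 0;
        }
    }
    return -1;  /* no match: this is where hotplug/udev would step in */
}
```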
- topic still quite volatile
Resources
- Documentation/powerpc/booting-without-of.txt
- describes in great detail how Linux works with device trees.
- Embedded Power Architecture Platform Requirements (ePAPR)
- APIs and code
- include/linux/of.h
- include/linux/of_platform.h
- include/linux/of_device.h
- arch/powerpc/include/asm/prom.h
- code is in drivers/of/, arch/powerpc/kernel/of_*
- SOC glue code in arch/powerpc/sysdev/fsl_soc.c
- drivers in drivers/
- dtc
- merged into kernel (commit a4da2e3ec84cda635ac441efbe781a38d2ee41ee), but that copy is already outdated
- better get it here: git://www.jdl.com/software/dtc
- questions to linuxppc-dev or irc #mklinux
- Kernel memory
- Memory mapping
- Accessor functions
- a LED driver
- a class LED driver
- Interrupt handling
- Delaying work - Top and bottom halves
- we must take care not to conflict with other drivers
- Linux kernel only sees virtual addresses -> we cannot simply
access physical memory.
- we must take care that compiler optimization / CPU reordering doesn't bite us, but forget about
volatile! (Documentation/volatile-considered-harmful.txt)
- especially powerpc people: beware of big endian registers!
API:
avoid conflicts with other drivers:
struct resource *request_mem_region(unsigned long start, unsigned long len, char *name);
void release_mem_region(unsigned long start, unsigned long len);
this shows up in /proc/iomem:
-bash-3.2# cat /proc/iomem
90000000-97ffffff : /plb/pciex@0a0000000
98000000-9fffffff : /plb/pciex@0c0000000
ef600200-ef600207 : serial
ef600300-ef600307 : serial
ef600400-ef600411 : ibm_iic
ef600500-ef600511 : ibm_iic
ef6c0000-ef6cffff : dwc_otg.0
ef6c0000-ef6cffff : dwc_otg
f8000000-f8001fff : ndfc-nand.0
fc000000-ffffffff : fc000000.nor_flash
map physical address into kernel address space:
void *ioremap(unsigned long phys_addr, unsigned long size);
void iounmap(void *addr);
avoid compiler optimization problems:
void barrier();
avoid compiler and hardware reordering problems:
mb(void)
rmb(void)
wmb(void)
be portable and avoid above problems:
void io(write|read)(8|16|32)[be] functions
be versions for big endian registers!
- write an interrupt handler...
irqreturn_t handler(int irq, void *dev_id);
- register it with the kernel...
int request_irq(unsigned int irq,
irq_handler_t handler,
unsigned long irqflags,
const char *devname,
void * dev_id);
- ...enable device to generate interrupts
int fiddle_with_registers();
- and handle! (when done undo in reverse order)
- But... this is not the case anymore for ARCH=powerpc! (but still for ARM)
- Virtual vs. real interrupts
- motivation: we can't do all work in the irq handler (for now at
least -> PREEMPT_RT can run irqs as kthreads)
- available mechanisms
- very old: bottom halves
- deprecated but available: tasklets (disadvantages: run in softirq context -> must be atomic, no userspace access, high prio -> block userspace process, etc)
- state of the art: workqueues
- Managed resource API
- Advanced debugging techniques: LTT, UML
- RCU
- Realtime Linux
- special virtual filesystem
- exports kernel information to userspace
- entries can be dynamically added
- ok for debugging but not for drivers to be merged upstream!
(use sysfs instead)
API
see procfs-guide in kernel DocBook documentation
- provides way to implement control commands
- deprecated but widely used
- basically a (command) number is sent to driver
API example:
#define DIGIO_IOC_MAGIC 'C'
/* parameter for DIGIO_SET_VAL ioctl */
struct digio_par {
unsigned int pinnr;
unsigned int val;
};
/*
* ioctl definitions
*/
/* the third argument is the type that is transferred, not a pointer to it */
#define DIGIO_GET_VAL _IOWR(DIGIO_IOC_MAGIC, 0, struct digio_par) /* pinnr in, val out */
#define DIGIO_SET_VAL _IOW(DIGIO_IOC_MAGIC, 1, struct digio_par)
#define DIGIO_RESET_ALL _IO(DIGIO_IOC_MAGIC, 2)
#define DIGIO_IOC_MAXNR 2
static int digio_ioctl(struct inode *inode, struct file *file, uint cmd, unsigned long arg)
{
    int err;
    struct digio_par par;
    void __user *uarg = (void __user *)arg;
    /* sanity checks */
    if (_IOC_TYPE(cmd) != DIGIO_IOC_MAGIC)
        return -ENOTTY;
    if (_IOC_NR(cmd) > DIGIO_IOC_MAXNR)
        return -ENOTTY;
    switch (cmd) {
    case DIGIO_GET_VAL:
        /* fetch the pin number from userspace before using it */
        if (copy_from_user(&par, uarg, sizeof(par))) {
            printk(KERN_CRIT "%s: copy_from_user failed\n", __FUNCTION__);
            err = -EFAULT;
            goto out;
        }
        if ((err = digio_get_pin(par.pinnr, &par.val)) < 0)
            goto out;
        if (copy_to_user(uarg, &par, sizeof(par))) {
            printk(KERN_CRIT "%s: copy_to_user failed\n", __FUNCTION__);
            err = -EFAULT;
            goto out;
        }
        break;
    case DIGIO_SET_VAL:
        if (copy_from_user(&par, uarg, sizeof(par))) {
            printk(KERN_CRIT "%s: copy_from_user failed\n", __FUNCTION__);
            err = -EFAULT;
            goto out;
        }
        digio_set_pin(par.pinnr, par.val);
        break;
    case DIGIO_RESET_ALL:
        digio_reset_all();
        break;
    default:
        /* this is defined by POSIX */
        return -ENOTTY;
    }
    /* all ok */
    err = 0;
out:
    return err;
}
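The "(command) number" is not arbitrary: the _IO* macros pack a direction, the argument size, the magic character and a sequence number into it, and the _IOC_* macros take it apart again. The following userspace sketch assumes a Linux host where <linux/ioctl.h> is available and passes the structure type itself as third macro argument; digio_cmd_plausible() is a hypothetical helper that repeats the driver's sanity checks.

```c
#include <linux/ioctl.h>

#define DIGIO_IOC_MAGIC 'C'

struct digio_par {
    unsigned int pinnr;
    unsigned int val;
};

#define DIGIO_SET_VAL   _IOW(DIGIO_IOC_MAGIC, 1, struct digio_par)
#define DIGIO_RESET_ALL _IO(DIGIO_IOC_MAGIC, 2)
#define DIGIO_IOC_MAXNR 2

/* Same sanity checks as at the top of digio_ioctl(): 1 if cmd is plausible */
static int digio_cmd_plausible(unsigned int cmd)
{
    if (_IOC_TYPE(cmd) != DIGIO_IOC_MAGIC)
        return 0;
    if (_IOC_NR(cmd) > DIGIO_IOC_MAXNR)
        return 0;
    return 1;
}
```

Because the size is encoded in the number, a command built with a pointer type instead of the structure type would carry the wrong size, which is why the structure type is passed here.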
Terms
- Concurrency (dt.: Nebenläufigkeit): two or more threads run at
the same time and interact
- race condition (dt.: Wettlaufsituation): inconsistent results
due to uncontrolled access to shared data
- deadlock (dt.: Blockierung): two or more processes are
indefinitely waiting for each other to free access to a resource
- critical path (dt.: kritischer Weg): code segment which accesses
a shared resource
- mutual exclusion (dt. wechselseitiger Ausschluss): techniques
used to protect critical paths against race conditions
Concurrency in the Linux kernel
- in old non-preemptive kernels (before 2.5):
- 2.5 and newer:
- kernel preemption
- SMP (other CPU)
- interrupts
- delayed code execution (workqueues, tasklets, timers)
=> we need to protect our critical paths!
Advice:
- ordering: always obtain multiple locks in the same order (avoids
deadlocks)
- start with coarse locking and refine
- document which functions need to be called with which locks held
- allow atomic access of variable
- set, read, test, add and subtract values
- good for protecting small data (e.g. counter, ...)
- very simple (-> hard to mess up)
- bitmask operations also available
- implementation is architecture specific
example:
initialize:
atomic_t counter = ATOMIC_INIT(0); /* or */
atomic_set(&counter, 0);
use:
atomic_add(3, &counter);
atomic_inc(&counter);
atomic_sub(2, &counter);
atomic_dec(&counter);
val = atomic_read(&counter);
operate and test:
if (atomic_sub_and_test(1, &counter)) {
printk(KERN_INFO "counter is zero!\n");
}
...
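The kernel's atomic_t cannot be tried out in userspace, but C11 <stdatomic.h> offers a close analogue. The sketch below is only that analogue, not the kernel implementation; counter_t and the counter_*() helpers are hypothetical names, and they take the pointer first, unlike the kernel calls.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace analogue of atomic_t; NOT the kernel type */
typedef struct { atomic_int v; } counter_t;

static void counter_set(counter_t *c, int i) { atomic_store(&c->v, i); }
static int counter_read(counter_t *c) { return atomic_load(&c->v); }
static void counter_add(counter_t *c, int i) { atomic_fetch_add(&c->v, i); }

/* Like atomic_sub_and_test(): subtract and report whether the result is 0 */
static bool counter_sub_and_test(counter_t *c, int i)
{
    /* atomic_fetch_sub() returns the value *before* the subtraction */
    return atomic_fetch_sub(&c->v, i) - i == 0;
}
```

The point of the combined "operate and test" call is that the subtraction and the zero check happen as one step; doing a subtract followed by a separate read would reopen the race.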
- binary semaphore: locked or unlocked
- used for protecting critical paths
- restrictions:
- one task can hold mutex at a time
- only task which locks can unlock again
- no recursive/multiple locking/unlocking
- can't be used in interrupt context (hard or soft)
- Advantages (over semaphores):
- Platform independent
- faster than semaphores (smaller!)
- code may sleep
- DEBUG_MUTEXES allows effective debugging
- Disadvantages
- only binary (no counting)
API: Kernel Mutexes
initialize:
#include <linux/mutex.h>
DEFINE_MUTEX(name);
at runtime:
mutex_init(struct mutex *lock);
locking:
void mutex_lock(struct mutex *lock);
int mutex_lock_interruptible(struct mutex *lock);
int mutex_trylock(struct mutex *lock);
unlocking:
void mutex_unlock(struct mutex *lock);
testing state:
int mutex_is_locked(struct mutex *lock);
- fast, busy waiting mutual exclusion mechanism
- Mechanism:
- Atomic variable is initalized to 1: available
-
spin_lock() atomically decrements value and checks if equals 0
- if yes: lock obtained
- if no: no lock, spin in tight loop until value is 1
-
spin_unlock() sets value back to 1
- must only be used in code that cannot sleep!
- no userspace memory access
- no memory allocation (except
GFP_ATOMIC)
- three flavors
-
spin_lock(), spin_unlock()
- on UP, no preemption: optimized out
- on UP with preemption: disable preemption
- protects against concurrency with "regular" kernel code
-
spin_lock_irqsave, spin_unlock_irqrestore, spin_lock_irq, spin_unlock_irq
- disables interrupts on local CPU
- protects against concurrency with interrupt handlers
- spin_lock_bh(), spin_unlock_bh()
- disables soft interrupts on local CPU
- protects against concurrency with bottom halves, tasklets, softirqs, ...
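The decrement-and-check mechanism described above can be modelled in userspace with C11 atomics. This is a sketch for illustration only: real kernel spinlocks are architecture specific and also deal with preemption and interrupts. All toy_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the lock word: 1 = available, <= 0 = taken. */
typedef struct {
	atomic_int value;
} toy_spinlock_t;

static void toy_spin_init(toy_spinlock_t *l)
{
	atomic_init(&l->value, 1);
}

/* One attempt: decrement and check whether the result is 0. */
static bool toy_spin_trylock(toy_spinlock_t *l)
{
	/* fetch_sub returns the old value; old 1 means the new value is 0 */
	return atomic_fetch_sub(&l->value, 1) == 1;
}

static void toy_spin_lock(toy_spinlock_t *l)
{
	while (!toy_spin_trylock(l)) {
		/* no lock: spin in a tight loop until the value is 1 again */
		while (atomic_load(&l->value) != 1)
			;
	}
}

static void toy_spin_unlock(toy_spinlock_t *l)
{
	atomic_store(&l->value, 1);
}
```

A failed attempt leaves the value negative, which is harmless: unlocking always stores 1, and only the one caller that moves the value from 1 to 0 owns the lock.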
reader/writer semaphores and spinlocks:
- situation: many readers, few writers
- allow multiple readers to access the data simultaneously, but only one writer
completions:
- situation: start something and be notified once it completes
- previously done with semaphores (deprecated in favor of mutexes)
RCU: read-copy-update:
- detailed LWN article
- advanced, high performance lock-free mechanism
- situation: many reads, few writes
- mechanism (extremely simplified):
- references to protected data must be held only by atomic code
- when protected data must be changed, writer makes copy, changes it and then updates pointer
- problem: code on other CPUs might hold reference to old version
- solution: because code holding rcu reference must be atomic, old ref must be gone after reschedule of all CPUs
PI mutexes:
- situation: a low priority thread holding a lock shared with an important high priority thread is preempted by a CPU intensive medium priority thread. The low priority thread can't finish and release the lock, so the high priority thread starves (priority inversion).
- mechanism: a thread holding a pi-mutex inherits priority of highest waiter
- interesting: Mars Pathfinder experienced this problem.
seqlocks:
- provide fast lockless access to shared resource
- situation: resource is small, simple and frequently read. write access is rare but must be fast
- mechanism: readers have free access, but must check for collisions with writers (and then retry)
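The reader retry loop of a seqlock can be sketched in userspace with a sequence counter: the writer makes the counter odd while it modifies the data, and a reader retries if it saw an odd counter or if the counter changed during the read. This is a simplified model with hypothetical toy_* names, assuming writers are already serialized against each other; the kernel's own interface lives in linux/seqlock.h.

```c
#include <stdatomic.h>

/* Hypothetical protected resource: a small pair of values. */
struct toy_seq_pair {
	atomic_uint seq;	/* even: stable, odd: write in progress */
	atomic_int a;
	atomic_int b;
};

/* Writer side; assumes only one writer runs at a time. */
static void toy_seq_write(struct toy_seq_pair *p, int a, int b)
{
	atomic_fetch_add(&p->seq, 1);	/* counter becomes odd */
	atomic_store(&p->a, a);
	atomic_store(&p->b, b);
	atomic_fetch_add(&p->seq, 1);	/* counter becomes even again */
}

/* Reader side: never blocks the writer, retries on collision. */
static void toy_seq_read(struct toy_seq_pair *p, int *a, int *b)
{
	for (;;) {
		unsigned int start = atomic_load(&p->seq);

		if (start & 1u)
			continue;	/* writer active, try again */
		*a = atomic_load(&p->a);
		*b = atomic_load(&p->b);
		if (atomic_load(&p->seq) == start)
			return;		/* no writer interfered */
	}
}
```

Readers never delay the writer, which is why the scheme suits data that is read often and written rarely; the cost is that a reader may have to repeat its work.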
- simplest form: kmalloc(size, flags)
- flags: see linux/gfp.h
- may sleep (GFP_KERNEL) or not (GFP_ATOMIC)
- fast, physically contiguous, doesn't clear the memory (security? -> kzalloc)
- free with kfree(ptr)
- more memory needed? allocate pages:
- get_zeroed_page(flags)
- __get_free_page(flags) /* non zeroed */
- __get_free_pages(flags, order)
- allocates 2^order pages
- calculate order from size: get_order(size)
- freeing
- free_page(ptr)
- free_pages(ptr, order)
- advantages: slightly faster and avoids fragmentation
- vmalloc / vfree
- above functions allocate physically contiguous memory
- vmalloc allocated memory is only virtually contiguous
- cons:
- slower because page tables need to be set up
- not usable from outside of the CPU (DMA)
- pro:
- large allocations less likely to fail due to fragmentation
- -> ok for e.g. a large software-only buffer
- Optimizations (premature opt ...)
- Situation: driver allocates many objects of the same size over and over again
- lookaside cache (see slab cache in DocBook "Memory Management in Linux")
- nice: cache usage information in /proc/slabinfo
- Situation: memory allocation is not allowed to fail
- mempool can be used for reserving memory in emergencies
- Usually bad: the reserved memory sits idle. Better to deal with allocation failures
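The order passed to __get_free_pages() is the smallest exponent for which 2^order pages cover the requested size. The calculation that get_order(size) performs can be sketched in userspace as follows; toy_get_order() and TOY_PAGE_SIZE are hypothetical names, and the 4 KiB page size is an assumption, since the real value is architecture specific.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages; the real page size depends on the architecture. */
#define TOY_PAGE_SIZE 4096UL

/* Smallest order such that 2^order pages hold at least size bytes. */
static int toy_get_order(size_t size)
{
	/* round the size up to whole pages */
	unsigned long pages = (size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
	int order = 0;

	while ((1UL << order) < pages)
		order++;
	return order;
}
```

Because the page count is rounded up to a power of two, a request for 5 pages actually occupies 8, so page allocations fit best when the size is close to a power-of-two number of pages.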
- Linux timekeeping: HZ, jiffies
- sometimes we need to delay execution, for example
- short: device initialization, timeouts, ...
- longer: need to do something at certain time
bad: burning CPU cycles (don't use):
#define DELAY_MS 100
unsigned long j1 = jiffies + DELAY_MS * HZ / 1000;
while (time_before(jiffies, j1))
	cpu_relax();
- cpu_relax() does very little: it calls architecture dependent code, which on powerpc is actually only a memory barrier.
slightly better but still wasteful:
while (time_before(jiffies, j1))
	schedule();
- others can run, but system stays loaded (we stay in run queue of the scheduler)
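Both loops above compare jiffies with time_before() instead of a plain "<", because the jiffies counter eventually wraps around. The userspace sketch below models that comparison with hypothetical names (toy_time_before(), toy_ms_to_ticks(), TOY_HZ); the tick rate of 250 is an assumption, and the cast relies on the usual two's complement behaviour when an unsigned long is converted to long.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumption: 250 ticks per second; the real HZ is a kernel config option. */
#define TOY_HZ 250UL

/* True if tick value a lies before b, even across a counter wraparound. */
static bool toy_time_before(unsigned long a, unsigned long b)
{
	/* the unsigned difference is reinterpreted as a signed distance */
	return (long)(a - b) < 0;
}

/* Same arithmetic as DELAY_MS * HZ / 1000; rounds down to whole ticks. */
static unsigned long toy_ms_to_ticks(unsigned long ms)
{
	return ms * TOY_HZ / 1000;
}
```

With the counter 5 ticks before its maximum, a 100 ms timeout (25 ticks) wraps around to the small value 19. A plain "now < timeout" test would then report the timeout as already expired, while the signed difference still sees it 25 ticks in the future. Note also that the rounding-down conversion yields 0 ticks for delays shorter than one tick.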
recommended for small (busy waiting) delays:
void ndelay(unsigned long nsecs);
void udelay(unsigned long usecs);
void mdelay(unsigned long msecs);
better: sleep instead of busy waiting if possible:
void msleep(unsigned int ms);
/*
 * interruptible version (for waitqueues)
 * returns 0, or the remaining time in ms if interrupted
 */
unsigned long msleep_interruptible(unsigned int ms);
void ssleep(unsigned int sec);
- Interesting: msleep(1) can sleep for up to 20ms (see lkml thread). Proposed solution: implement msleep with hrtimers.
longer delays: use waitqueues:
wait_queue_head_t wait;
init_waitqueue_head(&wait);
/* condition is always false=0, delay in jiffies */
wait_event_interruptible_timeout(wait, 0, delay);
- Linux provides standard implementations for various types and functions:
- linked lists
- circular, doubly linked list (linux/list.h)
- klists, more sophisticated list with locking and ref counting (linux/klist.h)
- kfifos (linux/kfifo.h)
- string functions
- bit and bitmap operations
- (commandline) parsing
- crc functions
- ...
make htmldocs
firefox Documentation/DocBook/index.html
-> kernel api
API: linked lists
type:
struct list_head {
	struct list_head *next, *prev;
};
Initialize head pointer:
struct list_head mempool;
INIT_LIST_HEAD(&mempool); /* at runtime or ... */
LIST_HEAD(mempool); /* at compile time */
Embed struct list_head in list element:
struct mem_chunk {
	unsigned char *ptr;
	int len;
	struct list_head list;
};
add element:
struct mem_chunk mc;
mc.ptr = this;
mc.len = 128;
list_add(&mc.list, &mempool);
remove element:
struct list_head *myelement = mempool.next; /* first element */
list_del(myelement); /* returns void */
find superstructure:
struct mem_chunk *chunk = list_entry(myelement, struct mem_chunk, list);
Functions:
see kernel-api documentation and include/linux/list.h
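How the embedded struct list_head and list_entry() fit together can be tried out in userspace with a stripped-down reimplementation. This is a sketch, not the code from include/linux/list.h: the toy_* names are hypothetical, only add and delete are modelled, and toy_list_del() clears the pointers to NULL, where the kernel handles removed entries differently.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* An empty circular list points at itself in both directions. */
#define TOY_LIST_HEAD_INIT(name) { &(name), &(name) }

/* Insert the new entry directly after head. */
static void toy_list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

/* Unlink an entry from whatever list it is on. */
static void toy_list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry->prev = NULL;
}

/* Step back from the embedded member to the start of the containing struct. */
#define toy_list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mem_chunk {
	unsigned char *ptr;
	int len;
	struct list_head list;
};
```

The list code never needs to know about struct mem_chunk; any structure can join a list by embedding a struct list_head, and the macro recovers the outer structure from the member's address.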
- procfs (deprecated, don't use anymore)
- sysfs (nice, complex, not for debugging)
- debugfs (simple and easy for debugging)
- relayfs (for high speed kernel-userspace transfer)
- netlink (socket oriented, asynchronous, used for routing, firewall, etc. easy to extend)
- ioctl (deprecated, messy, don't use)
Task:
Write a small module that creates a proc entry that can be read or written. Text written is rot13 scrambled and can be read back with the same proc entry.
- Processing shall not be done within the proc write function, but delayed with a workqueue
- Take care of race conditions!
- No buffering etc. required; only the last written text can be read back.
Tips:
static void rot_buf(char *buf, int rotate)
{
	int x;

	for (x = 0; buf[x] != '\0'; x++) {
		if (islower(buf[x]))
			buf[x] = ((buf[x] + rotate - 'a') % 26) + 'a';
		else if (isupper(buf[x]))
			buf[x] = ((buf[x] + rotate - 'A') % 26) + 'A';
		else if (isdigit(buf[x]))
			buf[x] = ((buf[x] + rotate - '0') % 10) + '0';
	}
}
- C library style character functions (islower(), isupper(), isdigit()): include/linux/ctype.h
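Before moving rot_buf() into the module it can be checked in userspace. The sketch below is a copy of the function adapted to the C library's <ctype.h>; the unsigned char conversion is an addition for userspace, where passing a negative char to islower() and friends is not allowed.

```c
#include <ctype.h>

/* Userspace copy of rot_buf(): rotates letters by 26 and digits by 10. */
static void rot_buf(char *buf, int rotate)
{
	int x;

	for (x = 0; buf[x] != '\0'; x++) {
		unsigned char c = (unsigned char)buf[x];

		if (islower(c))
			buf[x] = (char)(((c + rotate - 'a') % 26) + 'a');
		else if (isupper(c))
			buf[x] = (char)(((c + rotate - 'A') % 26) + 'A');
		else if (isdigit(c))
			buf[x] = (char)(((c + rotate - '0') % 10) + '0');
		/* everything else is left unchanged */
	}
}
```

With rotate = 13, "Hello, World 123" becomes "Uryyb, Jbeyq 456". Applying the function a second time restores the letters but not the digits (they continue to "789"), because 13 is half of 26 but not half of 10.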