Linux System Administration – Getting Started

Elvis Plesky

4 years ago

If you’re new to Linux system administration this guide offers you some useful tips and an overview of some of the common issues that may cross your path. Whether you’re a relative newcomer or a Linux administration stalwart, we hope that this collection of Linux commands will prove useful.

Basic Configuration

One of your first tasks in the administration of Linux is configuring the system, but it’s a process that often throws up a few hurdles. That’s why we’ve collected some tips to help you ‘jump’ over them. Let’s go through them:

Set the Hostname

Use these commands to check that the hostname is set correctly:

hostname

hostname -f

The first one needs to show your short hostname, while the one that follows it should show your FQDN—fully qualified domain name.
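The check described above can be sketched as a small shell function. The name is_fqdn_of is a hypothetical helper, not a standard command, and the hostnames are the example values used elsewhere in this guide.

```shell
# is_fqdn_of is a hypothetical helper, not a standard command.
# It succeeds when the second argument is the first argument
# followed by a dot and a non-empty domain part.
is_fqdn_of() {
    case "$2" in
        "$1".?*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Demonstrate with the example names used in this guide
if is_fqdn_of username username.example.com; then
    echo "username.example.com looks like a valid FQDN for username"
fi
```

On a live system you could call it as is_fqdn_of "$(hostname)" "$(hostname -f)" to compare the two values directly.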

Setting the Time Zone

Setting your server’s time zone to the one that most of your users share is something that they’ll no doubt appreciate. But if they’re scattered across continents, it’s better to play it safe and go for UTC (Coordinated Universal Time), which for practical purposes matches GMT (Greenwich Mean Time).

Operating systems all have their own ways of letting you switch time zones:

Setting the Time Zone in Ubuntu or Debian

Type this next command and answer the questions that pop up when prompted:

dpkg-reconfigure tzdata

Setting the Time Zone in Arch Linux or CentOS 7

See the list of time zones that are available:
timedatectl list-timezones

Use the Up, Down, Page Up and Page Down keys to select the one you’re after, then either copy it or write it down. Hit q to exit.

Set the time zone (change Europe/London to the correct zone):
timedatectl set-timezone 'Europe/London'

Manually Set the Time Zone

Locate the correct zone file in /usr/share/zoneinfo/ and link it to /etc/localtime. Here are some examples:

Coordinated Universal Time (UTC):

ln -sf /usr/share/zoneinfo/UTC /etc/localtime

Eastern Standard Time (fixed offset, without Daylight Saving Time):

ln -sf /usr/share/zoneinfo/EST /etc/localtime

American Central Time (including Daylight Saving Time):

ln -sf /usr/share/zoneinfo/US/Central /etc/localtime

American Eastern Time (including Daylight Saving Time):

ln -sf /usr/share/zoneinfo/US/Eastern /etc/localtime
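Before relinking /etc/localtime, it can help to preview what a zone would report. The following is a minimal sketch that relies on the TZ environment variable; preview_zone is a hypothetical helper name, and the sketch assumes zone files live under /usr/share/zoneinfo as described above.

```shell
# preview_zone is a hypothetical helper: print the current time as the
# given zone would show it, without touching /etc/localtime.
preview_zone() {
    if [ -e "/usr/share/zoneinfo/$1" ]; then
        TZ=$1 date
    else
        echo "no zone file for $1" >&2
        return 1
    fi
}

# Preview a zone (ignore the error if zone data is not installed)
preview_zone US/Central || true

# Show where /etc/localtime currently points, if it is a symlink
if [ -L /etc/localtime ]; then
    readlink /etc/localtime
fi
```

Because TZ only affects the single command it prefixes, nothing on the system changes until you run the ln -sf command yourself.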

Configure the /etc/hosts File

The /etc/hosts file offers a list of IP addresses and their matching hostnames. This lets you set hostnames for an IP address in one location on the local machine, and then have many applications connect to outside resources using those hostnames. Hosts files predate DNS, and on a default configuration the system consults /etc/hosts before it makes a DNS query. This means that /etc/hosts can help you maintain small “internal” networks, which you might want to use in development or for managing clusters.

It’s a requirement of some applications that the machine identifies itself properly in the /etc/hosts file. Because of this, we strongly suggest you configure the /etc/hosts file not long after deployment.

127.0.0.1 localhost.localdomain localhost

103.0.113.11 username.example.com username

You can specify several hostnames on each line, separated by spaces. Each line must begin with exactly one IP address. In the example above, swap out 103.0.113.11 for the IP address of your machine. Consider some extra /etc/hosts entries:

198.51.100.20 example.com

192.168.1.1 stick.example.com

Here, every request for the example.com domain or hostname is going to resolve to the IP address 198.51.100.20, which circumvents the DNS records for example.com and returns an alternative website.

The second line requests that the system looks to 192.168.1.1 for the domain stick.example.com. These types of host entries make administration of Linux easier – they are helpful for using “back channel” or “private” networks to get into other servers belonging to a cluster without the need to route traffic over the public network.
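To make the file format concrete, here is a minimal sketch that looks up a name in a hosts-format file. The hosts_lookup function is a hypothetical helper written for illustration, and it reads a temporary sample file rather than the real /etc/hosts.

```shell
# hosts_lookup is a hypothetical helper: print the IP address that a
# hosts-format file ($1) assigns to a hostname ($2).
hosts_lookup() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }    # skip comment lines
        {
            # field 1 is the IP address; every later field is a hostname
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$1"
}

# Build a sample file from the entries shown above
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
127.0.0.1 localhost.localdomain localhost
198.51.100.20 example.com
192.168.1.1 stick.example.com
EOF

hosts_lookup "$sample" stick.example.com
```

On a live system, getent hosts stick.example.com asks the system resolver the same question and so reflects the real lookup order.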

Network Diagnostics

Now let’s take a look at some simple Linux commands that are useful for assessing and diagnosing network problems. If you think you might be having connection problems, you can add the output from the appropriate commands to your support ticket. This will assist staff in resolving your issues. If your network problems are happening intermittently then this can be especially helpful.

The ping Command

The ping command lets you test the quality of the connection between the local machine and an external machine or address. These commands “ping” google.com and 215.48.207.120:

ping google.com

ping 215.48.207.120

Each command sends a small ICMP packet to the remote host and then awaits a response. If the system can make a connection, it reports the “round trip time” for each packet. Here’s what the first few replies look like when pinging google.com:

PING google.com (216.58.217.110): 56 data bytes

64 bytes from 216.58.217.110: icmp_seq=0 ttl=54 time=17.721 ms

64 bytes from 216.58.217.110: icmp_seq=1 ttl=54 time=15.374 ms

64 bytes from 216.58.217.110: icmp_seq=2 ttl=54 time=15.538 ms

The time field tells you how long each individual packet took to complete the round trip, in milliseconds. When you’ve got all the information you want, you can interrupt the process using Control+C. It will then show you some statistics that look like this:

--- google.com ping statistics ---

4 packets transmitted, 4 received, 0% packet loss, time 3007ms

rtt min/avg/max/mdev = 34.880/41.243/52.180/7.479 ms

These are the ones you should take note of:

Packet loss: the difference between how many packets were sent and how many came back, expressed as a percentage.
Round trip time (rtt): a summary of all the ping responses. “min” is the fastest packet round trip, which in this case took 34.88 milliseconds. “avg” is the average round trip, which took 41.243 milliseconds. “max” is the longest a packet took, which was 52.18 milliseconds. “mdev” is the standard deviation of the round-trip times, which for these four packets was 7.479 milliseconds.

The ping command is useful for giving you a rough measure of point-to-point network latency, and it’s the tool to reach for when you want to establish that you definitely can reach a remote server.
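If you want to pull the headline numbers out of the statistics for a script or a ticket, a sketch along these lines may help. The function names ping_avg_ms and ping_loss_pct are hypothetical, and the sample input is the statistics block shown above.

```shell
# Hypothetical helpers that read ping's summary lines on stdin.

# Print the average round trip time in milliseconds.
# The summary line is split on "/", which puts the average in field 5.
ping_avg_ms() {
    awk -F'/' '/^rtt|^round-trip/ { print $5 }'
}

# Print the packet loss percentage without the % sign.
ping_loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

stats='4 packets transmitted, 4 received, 0% packet loss, time 3007ms
rtt min/avg/max/mdev = 34.880/41.243/52.180/7.479 ms'

printf '%s\n' "$stats" | ping_avg_ms
printf '%s\n' "$stats" | ping_loss_pct
```

On a live system you could pipe a bounded run into the same functions, for example ping -c 4 google.com | ping_avg_ms.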

The traceroute Command

The traceroute command tells you a bit more than the ping command. It traces the packet’s journey from the local machine to the remote machine and reports each hop (each intermediate router the packet passes through) along the way. This can be useful when you’re investigating a network issue, because packet loss in one of the first few hops tells you that the problem may be with the user’s Internet service provider (ISP) or local area network (LAN) rather than with your server. But if packets are being lost near the end of the route, this could indicate a problem with the server’s connection.

This is what output from a traceroute command typically looks like:

traceroute to google.com (74.125.53.100), 30 hops max, 40 byte packets

1 207.192.75.2 (207.192.75.2) 0.414 ms 0.428 ms 0.509 ms

2 vlan804.tbr2.mmu.nac.net (209.123.10.13) 0.287 ms 0.324 ms 0.397 ms

3 0.e1-1.tbr2.tl9.nac.net (209.123.10.78) 1.331 ms 1.402 ms 1.477 ms

4 core1-0-2-0.lga.net.google.com (198.32.160.130) 1.514 ms 1.497 ms 1.519 ms

5 209.85.255.68 (209.85.255.68) 1.702 ms 72.14.238.232 (72.14.238.232) 1.731 ms 21.031 ms

6 209.85.251.233 (209.85.251.233) 26.111 ms 216.239.46.14 (216.239.46.14) 23.582 ms 23.468 ms

7 216.239.43.80 (216.239.43.80) 123.668 ms 209.85.249.19 (209.85.249.19) 47.228 ms 47.250 ms

8 209.85.241.211 (209.85.241.211) 76.733 ms 216.239.43.80 (216.239.43.80) 73.582 ms 73.570 ms

9 209.85.250.144 (209.85.250.144) 86.025 ms 86.151 ms 86.136 ms

10 64.233.174.131 (64.233.174.131) 80.877 ms 216.239.48.34 (216.239.48.34) 76.212 ms 64.233.174.131 (64.233.174.131) 80.884 ms

The hostnames and IP addresses sitting before and after a failed hop can help you determine whose machine is involved with the routing error. Lines with three asterisks (* * *) indicate failed hops.

If you’re trying to fix network issues or someone like your ISP is looking into it for you then traceroute output can help track down the problem, and recording traceroute information can really help when the issue only happens infrequently.
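For intermittent problems, one way to record diagnostic output is to append it to a log with a timestamp each time you run it. This is a minimal sketch; log_run is a hypothetical helper and the log path in the usage comment is only an assumption.

```shell
# log_run is a hypothetical helper: run a command and append its output,
# preceded by a UTC timestamp and the command line, to a log file.
# Usage: log_run LOGFILE COMMAND [ARGS...]
log_run() {
    log=$1
    shift
    {
        printf '=== %s: %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"
        "$@" 2>&1
    } >> "$log"
}

# Example (not run here because it needs network access):
#   log_run "$HOME/net-diag.log" traceroute google.com
```

Running it from cron or by hand whenever the problem appears gives you a dated history that you can attach to a support ticket.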

The mtr Command

Like traceroute, the mtr command tells you about the route that internet traffic takes between the local system and a remote host. However, mtr also gives you information about the round-trip time and packet loss at each hop. Think of mtr as a mixture of traceroute and ping.

An output from an mtr command might look like this:

HOST: username.example.com        Loss%   Snt   Last    Avg   Best   Wrst  StDev
256.129.75.4                       0.0%    10    0.4    0.4    0.3    0.6    0.1
vlan804.tbr2.mmu.nac.net           0.0%    10    0.3    0.4    0.3    0.7    0.1
0.e1-1.tbr2.tl9.nac.net            0.0%    10    4.3    4.4    1.3   11.4    4.1
core1-0-2-0.lga.net.google.com     0.0%    10   64.9   11.7    1.5   64.9   21.2
209.85.255.68                      0.0%    10    1.7    4.5    1.7   29.3    8.7
209.85.251.9                       0.0%    10   23.1   35.9   22.6   95.2   27.6
72.14.239.127                      0.0%    10   24.2   24.8   23.7   26.1    1.0
209.85.255.190                     0.0%    10   27.0   27.3   23.9   37.9    4.2
gw-in-f100.1e100.net               0.0%    10   24.1   24.4   24.0   26.5    0.7

As with the ping command, mtr reports on the connection in real time, updating latency and packet loss for each hop as it runs. Use Control+C to stop it manually, or use the --report flag to make it stop automatically after 10 packets and produce a report, like this:

mtr --report google.com

Don’t be surprised when it pauses while it’s producing the output. This is perfectly normal.

Linux System Diagnostics

If you’re having trouble with your system and it’s not related to networking or some other application problem, it might be useful to rule out hardware and issues at the operating system level. These tools can help you diagnose and fix such problems.

If you discover a problem with memory usage, you can use these tools and methods to find out exactly what’s causing it.

Check Level of Current Memory Use

Use this command:

free -m

Typical output looks like this:

             total       used       free     shared    buffers     cached
Mem:          1997        898       1104        105         34        699
-/+ buffers/cache:        216       1782
Swap:          255          0        255

Output like this requires some close reading to understand. It’s saying that the system is using 898 megabytes of memory (RAM) out of a total of 1997 megabytes, and 1104 megabytes are free. However, 699 megabytes of that used memory is cached data and another 34 megabytes is buffers. The operating system will empty its caches if more space is required, but it will hold onto a cache if no other process wants the memory. A Linux system will usually leave old data sitting in RAM until the space is needed for something else, so don’t worry if it looks like there is very little free memory.

In the example above, the “-/+ buffers/cache” line shows that 1782 MB of memory is effectively free once buffers and cache are discounted, and that is what any additional process will have to work with.
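On reasonably recent kernels, the same "effectively free" figure is exposed directly as the MemAvailable field in /proc/meminfo. The following is a minimal sketch that reads it; mem_available_mb is a hypothetical helper name, and older kernels that lack the field will simply print nothing.

```shell
# mem_available_mb is a hypothetical helper: print the kernel's estimate
# of memory available to new processes, in megabytes.
# It reads /proc/meminfo by default, or a file in the same format if given.
mem_available_mb() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Query the running system if /proc/meminfo is readable
if [ -r /proc/meminfo ]; then
    mem_available_mb
fi
```

The value in /proc/meminfo is reported in kB, so dividing by 1024 gives a number comparable to the free -m output.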

Use vmstat to Monitor I/O Usage

The vmstat tool tells you about memory, swap utilization, I/O wait, and system activity. It’s especially good for the diagnosis of I/O-type difficulties. Here’s an example:

vmstat 1 20

This runs a vmstat every second for twenty seconds, so it will pick up a sample of the current system state. Here’s how the output will typically look:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy  id wa
 0  0      4  32652  47888 110824    0    0     0     2   15   15  0  0 100  0
 0  0      4  32644  47888 110896    0    0     0     4  106  123  0  0 100  0
 0  0      4  32644  47888 110912    0    0     0     0   70  112  0  0 100  0
 0  0      4  32644  47888 110912    0    0     0     0   92  121  0  0 100  0
 0  0      4  32644  47888 110912    0    0     0    36   97  136  0  0 100  0
 0  0      4  32644  47888 110912    0    0     0     0   96  119  0  0 100  0
 0  0      4  32892  47888 110912    0    0     0     4   96  125  0  0 100  0
 0  0      4  32892  47888 110912    0    0     0     0   70  105  0  0 100  0
 0  0      4  32892  47888 110912    0    0     0     0   97  119  0  0 100  0
 0  0      4  32892  47888 110912    0    0     0    32   95  135  0  0 100  0

The memory and swap columns give you the same kind of information as the “free -m” command, although in a format that’s a little more difficult to read. In the output shown here, the most relevant information is in the last column, wa. It shows the percentage of time the CPU spends idle while it waits for I/O operations to complete.

If the number there is frequently a lot greater than 0, this points to an I/O usage issue. If your vmstat output resembles the example above, where wa stays at 0, you can rule I/O out.

If you’re hit with an intermittent issue, run vmstat while it’s happening so you can diagnose it correctly, or at least discount the possibility of an I/O issue. Any support staff helping you will welcome vmstat output to help them diagnose problems.
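Scanning the wa column by eye gets tedious over long runs, so a small filter can flag the samples worth looking at. This is a sketch only: high_iowait is a hypothetical helper, the threshold is your choice, and the second data row in the demonstration is invented to show a match. It assumes wa is the sixteenth column, as in the output above.

```shell
# high_iowait is a hypothetical helper: read vmstat output on stdin and
# report every sample whose wa value exceeds the given percentage ($1).
# The first two lines of vmstat output are headers and are skipped.
high_iowait() {
    awk -v limit="$1" 'NR > 2 && $16 + 0 > limit + 0 {
        print "sample " (NR - 2) ": wa=" $16
    }'
}

# Demonstration: the first data row is from the output above,
# the second is an invented row with heavy I/O wait.
printf '%s\n' \
    'procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----' \
    ' r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy  id wa' \
    ' 0  0      4  32652  47888 110824    0    0     0     2   15   15  0  0 100  0' \
    ' 1  3      4  32652  47888 110824    0    0   900    40  150  200  5 10  50 35' |
    high_iowait 10
```

On a live system the equivalent would be vmstat 1 20 | high_iowait 10.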

Monitor Processes, Memory, and CPU Usage with htop

You can get a more ordered view of your system’s state in real time by using htop. You’ll have to add it to most systems yourself, and, depending on your distribution, you’ll use one of these commands to do so:

apt-get install htop

yum install htop

pacman -S htop

emerge sys-process/htop

To start it, type:

htop

Press the F10 or Q keys at any time when you want to quit. Some htop behaviors may seem hard to fathom to start with, so be aware of the following:

The memory utilization graph shows cached memory, used memory and buffered memory, while the numbers displayed at the end of it indicate the total amount that’s available and the total amount installed as reported by the kernel.
The htop default configuration shows all application threads as separate processes, which might not be obvious if you weren’t aware of it. If you prefer to disable this then select the “setup” option with F2, then “Display Options,” and then toggle “Hide userland threads”.

The F5 key lets you toggle a “Tree” view that arranges the processes in a hierarchy. This is handy because it lets you see which processes were spawned by other processes and it shows it in an organized way. This can help you diagnose an issue when it’s hard to tell one process from another.

File System Management

Web developers and editors have often used the FTP protocol to manage and transfer files on a remote system. But the problem with FTP is that it’s very insecure, and it isn’t a very efficient way of managing the files on your system when you have SSH access.

If you’re new to Linux systems administration, you might want to use a graphical client such as WinSCP instead, or use rsync to synchronize files over SSH from the terminal.

Uploading Files to a Remote Server

If you have used an FTP client before, OpenSSH offers similar functionality over the SSH protocol. It is known as “SFTP,” and numerous clients such as WinSCP for Windows, Cyberduck for Mac OS X, and Filezilla for Linux, OS X, and Windows desktops support it.

If you’re familiar with FTP, then you’ll be comfortable with SFTP. If you’ve got access to a file system at the command line then you’ll automatically have the same access over SFTP, so bear this in mind when you set up user access.

You can also use Unix utilities such as scp and rsync to securely transfer your files. A command to copy team-info.tar.gz from a local machine to a remote server would look like:

scp team-info.tar.gz username@hostname.example.com:/home/username/backups/

After the scp command comes the path of the file on the local file system that you want to transfer, followed by the username and hostname of the remote machine separated by an “@” symbol. Use a colon (:) after the hostname and then put the path on the remote server where the file will be uploaded to. Here’s a less specific example:

scp [/path/to/local/file] [remote-username]@[remote-hostname]:[/path/to/remote/file]

OS X and Linux machines make this command available by default. It’s also useful for copying files between remote servers. If you use SSH keys, you can use the scp command without needing a password for each transfer.

The syntax of scp follows the form scp [source] [destination]. If you want to do the reverse operation and copy files from a remote host to your local machine, simply swap the source and destination.

Protecting Files on a Remote Server

It’s important to maintain file security when you give a number of users access to your network-accessible servers.

Best practices for security include:

Only giving users the minimum permissions required for whatever tasks they need to complete.
Only running services on public interfaces that are in active use. A frequent source of security vulnerabilities comes from unused daemons that have been left running, and this holds equally true for database servers, HTTP development servers, and FTP servers, too.
When you can, use SSH connections to encrypt any sensitive information that you want to transfer.

Symbolic Links

Symbolic linking, often referred to as “symlinking”, lets you create objects in your file system that point to other objects. This is useful if you want to let users and applications access particular files and directories without having to reorganize all your folders. For example, it lets users have restricted access to your web-accessible directories without moving your DocumentRoot to their home directories.

Type a command in the following format to set up a symbolic link:

ln -s /home/username/config-git/etc-hosts /etc/hosts

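This creates a link to the file etc-hosts at the location of the system’s /etc/hosts file. If a file already exists at that location, ln will refuse to overwrite it unless you add the -f option. More generically: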

ln -s [/path/to/target/file] [/path/to/location/of/sym/link]

Here are some features of the link command to be aware of:

The location of the link, which is the last term, can be left out. If you do that, a link with the same name as the file you’re linking to will be created in the current directory.
When specifying the link location, make sure that the path doesn’t have a slash at the end. You can produce a symlink that targets a directory, but make sure that the link name doesn’t end with a slash.
If you remove a symbolic link, the target file is not affected.
When you create a link, you can use relative or absolute paths.
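The behaviors listed above can be tried safely in a scratch directory. The following sketch uses a temporary directory and made-up file names, so nothing on the real system is touched.

```shell
# Work in a throwaway directory that is removed when the script exits
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Create a target file and a symbolic link that points to it
echo "127.0.0.1 localhost" > "$work/etc-hosts"
ln -s "$work/etc-hosts" "$work/hosts-link"

# readlink prints the path the link points to
readlink "$work/hosts-link"

# Reading through the link returns the target's contents
cat "$work/hosts-link"

# Removing the link leaves the target file in place
rm "$work/hosts-link"
ls "$work"
```

After the final command only etc-hosts is listed, which shows that deleting a link does not delete what it pointed to.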

Managing Files on a Linux System

If you’re new to handling files via the terminal interface as part of your Linux system administration role, here’s a list of basic commands to help you.

To copy files:

cp /home/username/todo.txt /home/username/archive/todo.01.txt

This copies todo.txt into the archive folder under the new name todo.01.txt. If you want to recursively copy every file and subdirectory in one directory into another, use -R in the command like this:

cp -R /home/username/archive/ /srv/backup/username.01/

To move a file or directory:

mv /home/username/archive/ /srv/backup/username.02/

You can also rename a file using the mv command.

To delete a file:

rm scratch.txt

This deletes the scratch.txt file from the current directory.
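If you would like to practice these commands without risking real files, a scratch directory works well. This is a minimal sketch with made-up file names that mirror the examples above.

```shell
# Work in a throwaway directory that is removed when the script exits
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir "$work/archive"
echo "buy milk" > "$work/todo.txt"

# Copy a file under a new name
cp "$work/todo.txt" "$work/archive/todo.01.txt"

# Recursively copy a whole directory; backup.01 is created by the copy
cp -R "$work/archive" "$work/backup.01"

# Rename a file with mv
mv "$work/archive/todo.01.txt" "$work/archive/todo.02.txt"

# Delete the original file
rm "$work/todo.txt"

# Show what is left
ls -R "$work"
```

Note that rm deletes immediately and there is no recycle bin, so adding -i to ask for confirmation is a reasonable habit while you are learning.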

Package Management

Package management tools, which come with the majority of Linux systems, make administration much easier. They make it simple to centrally install and maintain your system’s software. Installing your software manually makes it harder to manage dependencies and keep your system up to date. Package management tools keep you on top of the majority of such tasks, so here are some basic package management tasks.

Track Down Packages Installed on Your System

Packages are easy to install and they often pull in multiple dependencies that can be easy to lose sight of. These commands list all the packages installed on your system:

On Debian and Ubuntu systems:

dpkg -l

This example shows the first few lines of the output of this command on a production Debian Lenny system.

||/ Name Version Description

+++-============================-============================-===============================

ii adduser 3.110 add and remove users and groups

ii apache2-mpm-itk 2.2.6-02-1+lenny2 multiuser MPM for Apache 2.2

ii apache2-utils 2.2.9-10+lenny4 utility programs for webservers

ii apache2.2-common 2.2.9-10+lenny4 Apache HTTP Server common files

ii apt 0.7.20.2+lenny1 Advanced front-end for dpkg

ii apt-utils 0.7.20.2+lenny1 APT utility programs

ii bash 3.2-4 The GNU Bourne Again SHell

On CentOS and Fedora systems:

yum list installed

This example shows a few lines of the output from this command:

MAKEDEV.i386 3.23-1.2 installed

SysVinit.i386 2.86-15.el5 installed

CentOS and Fedora systems show the name of the package (SysVinit), the architecture it was compiled for (i386), and the build version installed on the system (2.86-15.el5).

For Arch Linux systems:

pacman -Q

This command pulls up a complete list of the packages installed on the system. Arch also lets you filter the results so that it only shows those packages that were explicitly installed (with the -Qe option) or that were installed automatically as dependencies (with the -Qd option). The command above is actually a combination of the output of two commands:

pacman -Qe

pacman -Qd

Here’s an example of the output:

perl-www-mechanize 1.60-

perl-yaml 0.70-1

pkgconfig 0.23-1

procmail 3.22-2

python 2.6.4-1

rsync 3.0.6-1

On Gentoo Linux systems:

emerge -evp --deep world

Here’s an example of this output:

These are the packages that would be merged, in order:

Calculating dependencies... done!

[ebuild R ] sys-libs/ncurses-5.6-r2 USE="unicode -debug -doc -gpm -minimal -nocxx -profile -trace" 0 kB

[ebuild R ] virtual/libintl-0 0 kB

[ebuild R ] sys-libs/zlib-1.2.3-r1 0 kB

Because most systems have so many packages installed, these commands can produce quite a lot of output, so it helps to use tools like grep and less to narrow your results. For example:

dpkg -l | grep "python"

This will pull up a list of all packages where the name or description features the word “python.” You can also use less in a similar way:

dpkg -l | less

This gives you the same list as the basic “dpkg -l” command, but the results will appear in the less pager, which will let you search and scroll more easily.

Adding | grep "[string]" to any of these commands will let you filter the package list, and on all distributions you can add | less to show the results in a pager.
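Beyond grep and less, awk is handy when you only want one column of the listing. The sketch below prints just the names of fully installed packages from dpkg -l style output; installed_names is a hypothetical helper, and the sample lines are taken from the Debian output shown earlier.

```shell
# installed_names is a hypothetical helper: read "dpkg -l" output on stdin
# and print only the names of packages whose status is "ii" (installed).
installed_names() {
    awk '$1 == "ii" { print $2 }'
}

# Demonstration with two lines from the sample output above
printf '%s\n' \
    'ii adduser 3.110 add and remove users and groups' \
    'ii bash 3.2-4 The GNU Bourne Again SHell' |
    installed_names
```

On a Debian or Ubuntu system, dpkg -l | installed_names | wc -l would give a quick count of installed packages.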

Finding Package Names and Information

The name of the package isn’t always intuitive, because it doesn’t always look like the name of the software. That’s why most package management tools let you search the package database. Such searches are great for finding a particular piece of software when you don’t know its package name.

For Debian and Ubuntu systems:

apt-cache search [package-name]

This searches the local package database for a particular term and then produces a list with descriptions. Here’s some of the output for apt-cache search python :

txt2regex - A Regular Expression "wizard", all written with bash2 builtins

vim-nox - Vi IMproved - enhanced vi editor

vim-python - Vi IMproved - enhanced vi editor (transitional package)

vtk-examples - C++, Tcl and Python example programs/scripts for VTK

zope-plone3 - content management system based on zope and cmf

zorp - An advanced protocol analyzing firewall

groovy - Agile dynamic language for the Java Virtual Machine

python-django - A high-level Python Web framework

python-pygresql-dbg - PostgreSQL module for Python (debug extension)

python-samba - Python bindings that allow access to various aspects of Samba

Be aware that apt-cache search queries the complete record of every package, not just the names and short descriptions shown here, which is why vim-nox and groovy are included: both mention python in their full descriptions. To view the complete record on a package use:

apt-cache show [package-name]

This will tell you about the maintainer, the dependencies, the size, the upstream project’s homepage, and the software’s description.

On CentOS and Fedora systems:

yum search [package-name]

This creates a list of all the packages in the database matching the given term. Here’s what the output of yum search wget typically looks like:

Loaded plugins: fastestmirror

Loading mirror speeds from cached hostfile

* addons: centos.secsup.org

* base: centos.secsup.org

* extras: centos.secsup.org

* updates: styx.biochem.wfubmc.edu

================================ Matched: wget =================================

wget.i386 : A utility for retrieving files using the HTTP or FTP protocols.

The package management tools can tell you more about any individual package. To get the complete record for a package from the package database use this command:

yum info [package-name]

This output will give you more detailed information about the package, its purpose, origins and dependencies.

On Arch Linux systems:

pacman -Ss [package-name]

This will search the local package database. Here’s a snippet from the results that a search for “python” would bring up:

extra/twisted 8.2.0-1

Asynchronous networking framework written in Python.

community/emacs-python-mode 5.1.0-1

Python mode for Emacs

The terms “extra” and “community” tell you which repository the software is in. To ask for additional information regarding a particular package, your command should be set out like this:

pacman -Si [package-name]

If you run pacman with the -Si option, it will get the record for the package from the database that includes a brief description, package size and dependencies.

For Gentoo Linux systems:

emerge --search [package-name]

emerge --searchdesc [package-name]

The first command will just look for package names in the database. The second one will search for both names and descriptions. These commands will let you search your local package tree (i.e., portage) for a particular package name or term. The output of either command will look similar to the example below.

Searching...

[ Results for search key : wget ]

[ Applications found : 4 ]

* app-emacs/emacs-wget

Latest version available: 0.5.0

Latest version installed: [ Not Installed ]

Size of files: 36 kB

Homepage: http://pop-club.hp.infoseek.co.jp/emacs/emacs-wget/

Description: Wget interface for Emacs

License: GPL-2

Because the output of the emerge --search command is already so detailed, there isn’t a separate tool to show you more information, unlike in some of the other distributions. If you want to narrow your search results down even more you can use regular expressions with the emerge --search command.

These commands can produce a lot of text, so tools like grep and less can be very useful for making the results easier to scroll through. For example:

apt-cache search python | grep "xml"

This will bring up all those packages that matched for the search term “python” and that also have “xml” somewhere in their name or description. In the same way:

apt-cache search python | less

This will give you the same list as the simple apt-cache search python but the results will be displayed in the less pager. This makes it easier to search and scroll.

If you add | grep "[string]" to these commands it will filter package search results, or you can use | less to show the results in the less pager. This works across all distributions.

Text Manipulation

On Linux and UNIX-like systems, the vast majority of system configuration information is held in plain text format, so next up are some basic Linux commands and tools for working with text files.

Search for a String in Files with grep

In Linux system administration the grep tool lets you search for a term or regex pattern within a stream of text, like a file or the output from a command.

Let’s look at how to use the grep tool:

grep "^Subject:.*HELP.*" /home/username/mbox

This searches the mbox file for lines that begin with “Subject:”, followed by any number of characters, then the word “HELP” in capital letters, then any number of further characters. grep prints each matching line in the terminal.

The grep tool offers some extra options. With -C, grep prints the lines of context around each match (e.g., -C 2 for two lines of context). With -n, grep prints the line number of each match. With -H, grep prints the file name for each match, which is handy when you “grep” a group of files or when you recursively “grep” through a file system (using -r). Type grep --help for extra options.
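To see these options in action, here is a minimal sketch that runs against a small sample mailbox. The file contents are invented for illustration and are written to a temporary file.

```shell
# Create a small sample mailbox in a temporary file
mbox=$(mktemp)
trap 'rm -f "$mbox"' EXIT
cat > "$mbox" <<'EOF'
From: alice@example.com
Subject: HELP with the printer
The printer is jammed again.
From: bob@example.com
Subject: lunch on Friday
EOF

# -n prefixes each match with its line number
grep -n "^Subject:.*HELP.*" "$mbox"

# -C 1 prints one line of context before and after each match
grep -C 1 "HELP" "$mbox"

# -c counts matching lines instead of printing them; -i ignores case
grep -c -i "subject:" "$mbox"
```

The first command prints the matching subject line prefixed with 2:, because the match is on the second line of the file.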

To grep a collection of files, you can specify the file using a wildcard:

grep -i "jones" ~/org/*.txt

This returns every line in which the word “jones” shows up. Case is ignored because of the -i option. The grep tool will search all files in the ~/org/ directory that have a .txt extension.

You can also use grep to filter the results of another command that sends output to standard out (stdout). You do this by “piping” the output of that command into grep. For example:

ls /home/username/data | grep "7521"

In this example, we assume that there are a lot of files with a UNIX timestamp in their file names in the /home/username/data directory. The command will filter the output so it only shows files with the digits “7521” in their file names. In these cases, grep only filters the output of ls and doesn’t check the contents of the file itself.

Search and Replace In a Group of Files

The sed tool, or the Stream EDitor, can search for a regex pattern and replace it with another string. It complements the grep tool, which is strong at filtering text with regular expressions but cannot edit a file or otherwise manipulate text.

Do be warned that sed is powerful enough to do a lot of damage if you don’t know how to wield it safely, so we suggest that you make backups so you can test your sed commands in safety before you run them. Here’s a simple sed one-liner, to demonstrate its syntax:

sed -i 's/^good/BAD/' singularity.txt

This replaces the word “good” wherever it appears at the beginning of a line (noted by the ^) with the string “BAD” in the file singularity.txt. The -i option tells sed to do the replacements “in place.” The sed command can produce backups of the files that it edits if you include a suffix after the -i option, as in -i.BAK. With that suffix, the example above would back up the original file as singularity.txt.BAK before making changes.
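A safe way to try an in-place edit is on a scratch copy with a backup suffix. The sketch below uses the suffix form -i.BAK and a temporary file with invented contents, so you can compare the edited file with its backup.

```shell
# Work in a throwaway directory that is removed when the script exits
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Sample file: only the first line starts with "good"
printf 'good morning\nthe good life\n' > "$work/singularity.txt"

# Edit in place, keeping the original as singularity.txt.BAK
sed -i.BAK 's/^good/BAD/' "$work/singularity.txt"

echo "edited file:"
cat "$work/singularity.txt"
echo "backup file:"
cat "$work/singularity.txt.BAK"
```

Only the first line changes, because the ^ anchor restricts the match to the start of a line; the "good" in the middle of the second line is left alone.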

A sed statement is generally formatted to look like:

's/[regex]/[replacement]/'

To match literal slashes (/), you must escape them by using a backslash (\), which is to say that if you want to match a / character you would need to use \/ in the sed expression. When searching for a string with a number of slashes, you can swap them for a different character. For example:

's|r/e/g/e/x|regex|'

This would remove the slashes from the string r/e/g/e/x so that it would become regex after the sed command was run on the file that contains the string.

This example searches and replaces one IP address with another. In this case, 97.22.58.33 is replaced with 87.65.33.31:

sed -i 's/97\.22\.58\.33/87\.65\.33\.31/' [file]

Here, period characters are escaped as \.. In regular expressions, the full-stop (period) character matches with any character if you don’t escape it.
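The effect of the alternative delimiter and of escaping the period can be checked without touching any files by feeding sed from echo. The input strings below are invented for illustration.

```shell
# An alternative delimiter avoids escaping every slash in the pattern
echo "r/e/g/e/x" | sed 's|r/e/g/e/x|regex|'

# Unescaped periods match any character, so this unrelated string matches
echo "97x22y58z33" | sed 's/97.22.58.33/matched/'

# Escaped periods match only literal periods, so the string is unchanged
echo "97x22y58z33" | sed 's/97\.22\.58\.33/matched/'

# The real address is replaced as intended
echo "97.22.58.33" | sed 's/97\.22\.58\.33/87.65.33.31/'
```

The second and third commands differ only in the escaping, which is why the second rewrites the line and the third prints it unchanged.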

Edit Text

You’ll often need to use a text editor to edit the contents of a file, and some distribution templates include the vi/vim and nano text editors. Both are small yet powerful tools that are at home manipulating text in the terminal environment.

Other options are available though, including emacs and “zile.” Use your operating system’s package manager to install these programs if you want. Be sure to search your package database in order to install a version that has been compiled without GUI components (i.e. X11).

To open a file, type a command that begins with the name of the editor you would like to run then the name of the file you want to edit. Here are some examples of commands that open the /etc/hosts file:

nano /etc/hosts

vi /etc/hosts

emacs /etc/hosts

zile /etc/hosts

Once you’ve edited a file, save and exit the editor to get back to the prompt. The actual procedure is a bit different with each editor. In emacs and zile it’s the same key sequence: hold Control and press x, then s, to save (usually written as “C-x C-s”), and then use “C-x C-c” to close the editor. In nano, use Control-O (written as ^O) and confirm the file name to write the file. Hit Control-X to exit.

For administration of Linux it helps to know that vi and vim are modal editors, and the way they work is a little more complicated. After you open a file in vi, you press the “i” key to switch to insert mode, which will allow you to edit text in the usual way. To save the file, you need to go back into “normal” mode, so just press the escape key (Control-[ also works), then type :wq and press Enter to write the file and exit the program.

This is just a brief introduction to using these text editors in Linux system administration, but there are many resources available online that will help you go from beginner to expert.

Webservers and HTTP Issues

It’s best to install and configure your webserver in a way that best suits your application or website. Let’s go over a number of basic webserver tasks and functions and offer some advice for beginners.

Serve Websites

Webservers work by listening on a TCP port, usually port 80 for HTTP and port 443 for HTTPS. When a visitor requests content, the servers respond by delivering it. Resources are usually specified with a URL that has the protocol, http or https; a colon and two slashes, ://; hostname or domain, www.example.com or username.example.com; followed by a file path, /images/avatar.jpg, or index.html. A complete URL would look something like: http://www.example.com/images/avatar.jpg.
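To make the parts of a URL concrete, here is a rough sketch that splits the example URL above using plain shell parameter expansion. It is deliberately simple and assumes a URL with no port number, query string or login details; the variable names are our own.

```shell
#!/bin/sh
# Sketch: split a simple URL into protocol, hostname and file path.
url='http://www.example.com/images/avatar.jpg'

protocol=${url%%://*}                # text before the first "://"
rest=${url#*://}                     # text after the first "://"
host=${rest%%/*}                     # up to the first slash

# If there is no slash after the hostname, the path is just "/".
case "$rest" in
    */*) path=/${rest#*/} ;;
    *)   path=/ ;;
esac

printf 'protocol: %s\nhost:     %s\npath:     %s\n' "$protocol" "$host" "$path"
```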

To offer these resources to visitors, your system must be running a webserver. There are lots of different HTTP servers and endless configurations to support various web development frameworks. The three recommended webservers for general use are Apache HTTP server, Lighttpd, and Nginx. There are pluses and minuses for all of them, and the one you choose will largely depend on a combination of your needs and your experience.

Once you’ve decided which webserver to go for, you need to decide what (if any) scripting support you need to install. Scripting support lets your webserver serve dynamic content by running server-side scripts written in languages like Python, PHP, Ruby, and Perl.

How to Choose a Webserver

Most visitors don’t know which webserver you use so the one you choose really comes down to your own requirements and preferences. This can make Linux system administration a challenge for anyone new to it, so let’s consider some of your choices.

The Apache HTTP Server is thought by many to be the ideal webserver. It’s the open-source option that’s used more than any other, its configuration interface has enjoyed many years of stability and its modular architecture suits all kinds of deployments. Apache is the basis of the LAMP stack, and it helps to integrate dynamic server-side apps into the webserver.

The thing with webservers like Lighttpd and nginx is that they’re more weighted towards serving static content efficiently. If you’re dealing with high demand and limited server resources then one of these servers might be the better option. Lighttpd and nginx offer stability and functionality and they don’t strain system resources, but on the downside, they can be harder to configure when you want to integrate dynamic content interpreters.

So, choose your webserver according to your needs, taking into account factors like the type of content you’ll be serving, how in-demand it will be, and how comfortable you are managing Linux system administration with that software.

Apache Logs

With Apache, webserver problems can be difficult to troubleshoot, but there are known common issues which will give you clues about where to start. When things get a little trickier, you might need to look through the Apache error logs.

These are located in the /var/log/apache2/error.log file by default (on Debian-based distributions). You can track or “tail” this log with this command:

tail -F /var/log/apache2/error.log

We suggest you add a custom log setting:

Configuring Apache Virtual Host

ErrorLog /var/www/html/example.com/logs/error.log
CustomLog /var/www/html/example.com/logs/access.log combined

Here example.com is a stand-in for the name of your virtual host and the place where its resources are kept. Apache creates two log files with logged information relating to that virtual host, making administration of Linux easier as you troubleshoot errors on specific virtual hosts. To track or tail the error log:

tail -F /var/www/html/example.com/logs/error.log
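One thing to watch for: Apache generally will not create a missing log directory for you, and it may refuse to start if the directory named in the log settings does not exist. The sketch below creates the directory before you reload the server. WEB_ROOT is a variable we have made up for this example; it falls back to a temporary directory so the commands can be tried without touching /var/www/html.

```shell
#!/bin/sh
# Sketch: make sure a virtual host's log directory exists.
# WEB_ROOT is a made-up variable; set it to /var/www/html for real use.
set -eu
web_root=${WEB_ROOT:-$(mktemp -d)}
site=example.com                     # stand-in for your virtual host name
log_dir="$web_root/$site/logs"

mkdir -p "$log_dir"                  # -p also creates missing parent directories
printf 'log directory ready: %s\n' "$log_dir"
```

On a real server you would run it as, for example, WEB_ROOT=/var/www/html sh make-log-dir.sh (the script name is also just an example) with sufficient privileges.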

This displays new error messages when they appear. You can take specific parts of an error message from an Apache log and do a web search to diagnose problems. Common ones include:

Missing files, or mistakes in file names
Permissions errors
Configuration errors
Dynamic code execution or interpretation errors

DNS Servers and Domain Names

DNS stands for Domain Name System, and it’s the service used by the Internet to link the difficult-to-remember chain of numbers in IP addresses with more memorable domain names. This section will look at some DNS-type tasks.

Redirect DNS Queries using CNAMEs

Using CNAME DNS records makes it possible to redirect requests for one hostname or domain to a different hostname or domain. This helps when you need to reroute requests for one domain to a different one, thus avoiding the need to set up a webserver to handle such requests.

CNAMEs only redirect one hostname or domain to another. If you need to point a full URL somewhere else, you’ll have to set up a webserver and do some server-level redirection configuration and/or web hosting. CNAMEs let you redirect subdomains, like team.example.com, to other ones, like jill.example.org. CNAMEs have to point to a valid domain with a valid A Record, or to another CNAME.

Despite some limitations, CNAMEs can occasionally be quite helpful in the administration of Linux, particularly if you need to switch a machine’s hostname.

Setting Up Subdomains

A name that comes before a root domain indicates that it’s a subdomain. In team.example.com, team is a subdomain of the root domain example.com.

Follow these steps to create and host a sub-domain:

First, create an A Record for the subdomain in the DNS zone. You can do this using the DNS Manager. You can host the DNS for your domain with the provider of your choice.
Set up a server to respond to requests sent to this domain. For webservers like Apache, you’ll need to configure a new virtual host. For XMPP servers configure another host to accept the requests for this host. For more information, consult the Linux system administration documentation for the particular server you want to deploy.
Configured subdomains work almost like root domains on your server. You can set up HTTP redirection for the new subdomain if you need to.

SMTP Servers and Email Issues

In this section, we’ll be looking at setting up email to suit your requirements and configuring your system to send email.

Which Email Solution?

Email functionality with Linux administration hinges on two major components. The SMTP server or “Mail Transfer Agent” is the most significant one. The MTA, as it’s known, sends mail between servers. The second component is the server that makes delivered mail available to the user’s own machine. These servers often use a protocol like POP3 or IMAP to give remote access to the mailbox.

The email server tool chain can also feature other components, which you might have access to depending on your deployment. They include filtering and delivery tools such as procmail, anti-virus filters such as ClamAV, mailing list managers like MailMan, and spam filters like SpamAssassin. These components work independently of the MTA and remote mailbox server.

The most widely used SMTP servers or MTAs in the UNIX-like arena are Postfix, Exim, and Sendmail. Sendmail is the oldest and lots of Linux administration professionals know it well. Postfix is modern and robust, and it slots into many different configurations. Exim is the standard MTA in Debian systems, and many feel that it’s easier to use for basic tasks. Servers like Courier and Dovecot are also popular for remote mailbox access.

If you’re looking for an email solution that is easy to install, you could take a look at Citadel groupware server. Citadel offers an integrated “turnkey” solution that comes with an SMTP server, remote mailbox access, real time collaboration tools including XMPP, and a shared calendar interface.

If you’re looking for a simpler and modular email stack, it’s worth taking a look at Postfix SMTP server.

Sending Email From Your Server

For simple configurations, you might not need a full email stack, but applications running on that server will still need to be able to send mail for notifications and to meet other day-to-day needs.

We can’t go into configuring applications to send notifications and alerts in this guide, but the majority of applications rely on a simple “sendmail” interface, which is provided by several common tools, including the Postfix SMTP server and the msmtp SMTP client.

To install Postfix on Debian and Ubuntu systems:

apt-get install postfix

On CentOS and Fedora systems:

yum install postfix

When you’ve installed Postfix, your applications should be able to access the sendmail interface, which can be found at /usr/sbin/sendmail. The majority of applications running on your system should be capable of sending mail with this setup.
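If you want to check the interface by hand, the following sketch shows roughly how a script might use it. The function names, addresses and message text are our own inventions; the -t option tells sendmail to take the recipients from the message headers. The sketch only prints the message; the sending function is defined but not called.

```shell
#!/bin/sh
# Sketch: build a simple message and hand it to the sendmail interface.

# Print a message: headers, a blank line, then the body.
# $1 = recipient, $2 = subject, $3 = body
build_message() {
    printf 'To: %s\nSubject: %s\n\n%s\n' "$1" "$2" "$3"
}

# Pipe the message to sendmail; -t reads the recipients from the headers.
send_notification() {
    build_message "$@" | /usr/sbin/sendmail -t
}

# Placeholder address and text, shown on screen instead of being sent.
message=$(build_message 'admin@example.com' 'Disk usage warning' 'Usage on / exceeded 90%.')
printf '%s\n' "$message"

# To actually send it:
# send_notification 'admin@example.com' 'Disk usage warning' 'Usage on / exceeded 90%.'
```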

If you need to use your server to send email through an external SMTP server, you might want to think about a simpler tool like msmtp because it’s included in the majority of distributions, and it can be installed using the appropriate command:

apt-get install msmtp

yum install msmtp

pacman -S msmtp

Use type msmtp or which msmtp to find where msmtp is on your system (usually at /usr/bin/msmtp). You can set authentication credentials with command line arguments or by declaring SMTP credentials in a configuration file.
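As a rough illustration of the configuration file approach, the sketch below writes a small msmtp configuration. Every value in it (account name, server, port, addresses, password) is a placeholder, and MSMTP_CONF is a variable we have made up so that the example writes to a temporary file by default rather than to your real ~/.msmtprc.

```shell
#!/bin/sh
# Sketch: write a minimal msmtp configuration file with placeholder values.
# MSMTP_CONF is a made-up variable; point it at ~/.msmtprc for real use.
set -eu
umask 077                            # new files readable by the owner only
conf=${MSMTP_CONF:-$(mktemp)}

cat > "$conf" <<'EOF'
# All values below are placeholders.
defaults
auth on
tls on

account relay
host smtp.example.com
port 587
from alerts@example.com
user alerts@example.com
password CHANGE_ME

account default : relay
EOF

chmod 600 "$conf"                    # keep the stored password private
printf 'wrote %s\n' "$conf"
```

You could then test it with something like printf 'Subject: test\n\nhello\n' | msmtp -C "$conf" you@example.com, where -C names the configuration file to use. Depending on your msmtp version you may also need to tell it where to find trusted TLS certificates, and you may prefer the passwordeval setting over a plain-text password.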