tcp(7)
NAME
tcp - TCP protocol.
SYNOPSIS
#include <sys/socket.h>
#include <netinet/in.h>
tcp_socket = socket(PF_INET, SOCK_STREAM, 0);
DESCRIPTION
This is an implementation of the TCP protocol defined in
RFC793, RFC1122 and RFC2001 with the NewReno and SACK
extensions. It provides a reliable, stream oriented, full
duplex connection between two sockets on top of ip(7).
TCP guarantees that the data arrives in order and retransmits
lost packets. It generates and checks a per-packet
checksum to catch transmission errors. TCP does not preserve
record boundaries.
A fresh TCP socket has no remote or local address and is
not fully specified. To create an outgoing TCP connection
use connect(2) to establish a connection to another TCP
socket. To receive new incoming connections bind(2) the
socket first to a local address and port and then call
listen(2) to put the socket into listening state. After
that a new socket for each incoming connection can be
accepted using accept(2). A socket which has had accept(2)
or connect(2) successfully called on it is fully specified
and may transmit data. Data cannot be transmitted on
listening or not yet connected sockets.
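The sequence of calls described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the helper name tcp_loopback_roundtrip is hypothetical, the listener is bound to 127.0.0.1 with port 0 so that the kernel picks a free port, and both ends live in one process only to keep the example short.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a listener on 127.0.0.1, connect to it,
 * accept the connection and pass one message from the connecting end
 * to the accepted end. Returns 0 on success, -1 on failure. */
int tcp_loopback_roundtrip(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char buf[8] = {0};
    int lfd, cfd, afd = -1, rc = -1;

    lfd = socket(PF_INET, SOCK_STREAM, 0);   /* listening socket */
    cfd = socket(PF_INET, SOCK_STREAM, 0);   /* connecting socket */
    if (lfd < 0 || cfd < 0)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the kernel pick a port */

    /* bind(2), then listen(2); getsockname(2) reveals the chosen port. */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto out;
    assert(len == sizeof(addr));

    /* connect(2) on one side, accept(2) on the other. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;
    afd = accept(lfd, NULL, NULL);
    if (afd < 0)
        goto out;

    /* Both cfd and afd are now fully specified and may transmit data. */
    if (send(cfd, "ping", 4, 0) != 4)
        goto out;
    if (recv(afd, buf, 4, MSG_WAITALL) != 4)
        goto out;
    rc = memcmp(buf, "ping", 4) == 0 ? 0 : -1;
out:
    if (afd >= 0)
        close(afd);
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return rc;
}
```

Note that data is exchanged on the connecting socket and on the socket returned by accept(2), never on the listening socket itself.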
Linux 2.2 supports the RFC1323 TCP high performance extensions.
This includes large TCP windows to support links
with high latency or bandwidth. In order to make use of
them, the send and receive buffer sizes must be increased.
They can be set globally with the net.core.wmem_default
and net.core.rmem_default sysctls, or on individual sockets
by using the SO_SNDBUF and SO_RCVBUF socket options.
The maximum sizes for socket buffers are limited by the
global net.core.rmem_max and net.core.wmem_max sysctls.
See socket(7) for more information.
TCP supports urgent data. Urgent data is used to signal
the receiver that some important message is part of the
data stream and that it should be processed as soon as
possible. To send urgent data specify the MSG_OOB option
to send(2). When urgent data is received, the kernel
sends a SIGURG signal to the reading process or the process
or process group that has been set for the socket
using the SIOCSPGRP or FIOSETOWN ioctls. When the
SO_OOBINLINE socket option is enabled, urgent data is put
into the normal data stream (and can be tested for by the
SIOCATMARK ioctl), otherwise it can be received only when
the MSG_OOB flag is set for recvmsg(2).
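The following sketch sends one urgent byte over a loopback connection and reads it back out of band. It is an illustration under stated assumptions: the helper names tcp_loopback_pair and tcp_urgent_roundtrip are hypothetical, SO_OOBINLINE is left disabled, and the receiver waits with poll(2) and POLLPRI instead of installing a SIGURG handler, purely to keep the example short.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: build a connected TCP pair over 127.0.0.1.
 * pair[0] is the connecting end, pair[1] the accepted end.
 * Returns 0 on success, -1 on failure. */
int tcp_loopback_pair(int pair[2])
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd = socket(PF_INET, SOCK_STREAM, 0);

    assert(pair != NULL);
    pair[0] = socket(PF_INET, SOCK_STREAM, 0);
    pair[1] = -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* port 0: kernel picks */

    if (lfd >= 0 && pair[0] >= 0 &&
        bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        listen(lfd, 1) == 0 &&
        getsockname(lfd, (struct sockaddr *)&addr, &len) == 0 &&
        connect(pair[0], (struct sockaddr *)&addr, sizeof(addr)) == 0)
        pair[1] = accept(lfd, NULL, NULL);

    if (lfd >= 0)
        close(lfd);
    if (pair[1] >= 0)
        return 0;
    if (pair[0] >= 0)
        close(pair[0]);
    return -1;
}

/* Send one urgent byte with MSG_OOB and receive it with MSG_OOB.
 * Returns the byte received (0..255), or -1 on failure. */
int tcp_urgent_roundtrip(char byte)
{
    struct pollfd pfd;
    int pair[2];
    char got = 0;
    int rc = -1;

    if (tcp_loopback_pair(pair) < 0)
        return -1;
    pfd.fd = pair[1];
    pfd.events = POLLPRI;        /* wait until urgent data is pending */
    pfd.revents = 0;

    if (send(pair[0], &byte, 1, MSG_OOB) == 1 &&
        poll(&pfd, 1, 2000) == 1 && (pfd.revents & POLLPRI) &&
        recv(pair[1], &got, 1, MSG_OOB) == 1)
        rc = (unsigned char)got;

    close(pair[0]);
    close(pair[1]);
    return rc;
}
```

With SO_OOBINLINE enabled the urgent byte would instead arrive in the normal stream, and the MSG_OOB receive shown here would not apply.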
ADDRESS FORMATS
TCP is built on top of IP (see ip(7)). The address formats
defined by ip(7) apply to TCP. TCP only supports
point-to-point communication; broadcasting and multicasting
are not supported.
SYSCTLS
These sysctls can be accessed by the /proc/sys/net/ipv4/*
files or with the sysctl(2) interface. In addition, most
IP sysctls also apply to TCP; see ip(7).
tcp_window_scaling
Enable RFC1323 TCP window scaling.
tcp_sack
Enable RFC2018 TCP Selective Acknowledgements.
tcp_timestamps
Enable RFC1323 TCP timestamps.
tcp_fin_timeout
How many seconds to wait for a final FIN packet
before the socket is forcibly closed. This is
strictly a violation of the TCP specification, but
required to prevent denial-of-service attacks.
tcp_keepalive_probes
Maximum TCP keep-alive probes to send before giving
up. Keep-alives are only sent when the SO_KEEPALIVE
socket option is enabled.
tcp_keepalive_time
The number of seconds after no data has been transmitted
before a keep-alive will be sent on a connection.
The default is 10800 seconds (3 hours).
tcp_max_ka_probes
How many keep-alive probes are sent per slow timer
run. To prevent bursts, this value should not be
set too high.
tcp_stdurg
Enable the strict RFC793 interpretation of the TCP
urgent-pointer field. The default is to use the
BSD-compatible interpretation of the urgent-
pointer, pointing to the first byte after the
urgent data. The RFC793 interpretation is to have
it point to the last byte of urgent data. Enabling
this option may lead to interoperability problems.
tcp_syncookies
Enable TCP syncookies. The kernel must be compiled
with CONFIG_SYN_COOKIES. Syncookies protect a
socket from overload when too many connection
attempts arrive. When syncookies are enabled, client
machines may no longer be able to detect an
overloaded machine by using a short timeout.
tcp_max_syn_backlog
Length of the per-socket backlog queue. As of Linux
2.2, the backlog specified in listen(2) only specifies
the length of the backlog queue of already
established sockets. The maximum queue of sockets
not yet established (in SYN_RECV state) per listen
socket is set by this sysctl. When more connection
requests arrive, Linux starts to drop packets. When
syncookies are enabled the packets are still
answered and this value is effectively ignored.
tcp_retries1
Defines how many times an answer to a TCP connection
request is retransmitted before giving up.
tcp_retries2
Defines how many times a TCP packet is retransmitted
in established state before giving up.
tcp_syn_retries
Defines how many times to try to send an initial
SYN packet to a remote host before giving up and
returning an error. Must be below 255. This is only
the timeout for outgoing connections; for incoming
connections the number of retransmits is defined by
tcp_retries1.
tcp_retrans_collapse
Try to send full-sized packets during retransmit.
This is used to work around TCP bugs in some
stacks.
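As a sketch of the /proc interface mentioned at the start of this section, the following helper reads the first integer of a TCP sysctl. The helper name tcp_sysctl_read is hypothetical, and which files exist depends on the kernel version and configuration, so a missing file is reported as an ordinary failure.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read the first integer stored in
 * /proc/sys/net/ipv4/<name> into *value.
 * Returns 0 on success, -1 on failure. */
int tcp_sysctl_read(const char *name, long *value)
{
    char path[128];
    FILE *fp;
    int n, ok;

    assert(name != NULL && value != NULL);
    n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;               /* formatting error or name too long */

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;               /* sysctl not present or not readable */
    ok = fscanf(fp, "%ld", value) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```

For example, tcp_sysctl_read("tcp_keepalive_time", &v) would store the keep-alive idle time in v on systems that provide that file. Writing a sysctl works the same way with the file opened for writing, but normally requires root privileges.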
SOCKET OPTIONS
To set or get a TCP socket option, call getsockopt(2) to
read or setsockopt(2) to write the option with the socket
family argument set to SOL_TCP. In addition, most SOL_IP
socket options are valid on TCP sockets. For more
information see ip(7).
TCP_NODELAY
Turn the Nagle algorithm off. This means that packets
are always sent as soon as possible and no
unnecessary delays are introduced, at the cost of
more packets in the network. Expects an integer
boolean flag.
TCP_MAXSEG
Set or receive the maximum segment size for
outgoing TCP packets. If this option is set before
connection establishment, it also changes the MSS
value announced to the other end in the initial
packet. Values greater than the interface MTU are
ignored and have no effect.
TCP_CORK
If enabled, don't send out partial frames. All
queued partial frames are sent when the option is
cleared again. This is useful for prepending headers
before calling sendfile(2), or for throughput
optimization. This option cannot be combined with
TCP_NODELAY.
IOCTLS
These ioctls can be accessed using ioctl(2). The correct
syntax is:
int value;
error = ioctl(tcp_socket, ioctl_type, &value);
FIONREAD
Returns the amount of queued unread data in the
receive buffer. Argument is a pointer to an integer.
SIOCATMARK
Returns true when all urgent data has already been
received by the user program. This is used
together with SO_OOBINLINE. Argument is a pointer
to an integer for the test result.
TIOCOUTQ
Returns the amount of unsent data in the socket
send queue in the passed integer value pointer.
ERROR HANDLING
When a network error occurs, TCP tries to resend the
packet. If it doesn't succeed after some time, either
ETIMEDOUT or the last received error on this connection is
reported.
Some applications require a quicker error notification.
This can be enabled with the SOL_IP level IP_RECVERR
socket option. When this option is enabled, all incoming
errors are immediately passed to the user program. Use
this option with care - it makes TCP less tolerant to
routing changes and other normal network conditions.
NOTES
When an error that occurred during connection setup is
detected in a socket write, SIGPIPE is raised only when the
SO_KEEPOPEN socket option is set.
TCP has no real out-of-band data; it has urgent data. In
Linux this means if the other end sends newer out-of-band
data the older urgent data is inserted as normal data into
the stream (even when SO_OOBINLINE is not set). This differs
from BSD based stacks.
Linux uses the BSD compatible interpretation of the urgent
pointer field by default. This violates RFC1122, but is
required for interoperability with other stacks. It can be
changed by the tcp_stdurg sysctl.
ERRORS
EPIPE The other end closed the socket unexpectedly or a
read is executed on a shut down socket.
ETIMEDOUT
The other end didn't acknowledge retransmitted data
after some time.
EAFNOTSUPPORT
Passed socket address type in sin_family was not
AF_INET.
Any errors defined for ip(7) or the generic socket layer
may also be returned for TCP.
BUGS
Not all errors are documented.
IPv6 is not described.
Transparent proxy options are not described.
VERSIONS
The sysctls are new in Linux 2.2. IP_RECVERR is a new
feature in Linux 2.2. TCP_CORK is new in 2.2.
SEE ALSO
socket(7), socket(2), ip(7), sendmsg(2), recvmsg(2).
RFC793 for the TCP specification.
RFC1122 for the TCP requirements and a description of the
Nagle algorithm.
RFC2001 for some TCP algorithms.