ip(7)





NAME

       ip - Linux IPv4 protocol implementation


SYNOPSIS

       #include <sys/socket.h>
       #include <netinet/in.h>

       tcp_socket = socket(PF_INET, SOCK_STREAM, 0);
       raw_socket = socket(PF_INET, SOCK_RAW, protocol);
       udp_socket = socket(PF_INET, SOCK_DGRAM, protocol);


DESCRIPTION

       Linux   implements   the  Internet  Protocol,  version  4,
       described in RFC791 and RFC1122.  ip contains  a  level  2
       multicasting  implementation  conforming  to  RFC1112.  It
       also contains an IP router including a packet filter.

       The programmer's interface is BSD sockets compatible.  For
       more information on sockets, see socket(7).

       An  IP socket is created by calling the socket(2) function
       as socket(PF_INET, socket_type, protocol).   Valid  socket
       types  are SOCK_STREAM to open a tcp(7) socket, SOCK_DGRAM
       to open a udp(7) socket, or  SOCK_RAW  to  open  a  raw(7)
       socket  to  access  the IP protocol directly.  protocol is
       the IP protocol in the IP header to be received  or  sent.
       The  only  valid values for protocol are 0 and IPPROTO_TCP
       for TCP sockets and 0 and  IPPROTO_UDP  for  UDP  sockets.
       For  SOCK_RAW  you  may  specify  a valid IANA IP protocol
       defined in RFC1700 assigned numbers.

       When a process wants to receive new  incoming  packets  or
       connections,  it should bind a socket to a local interface
       address using bind(2).  Only one IP socket may be bound to
       any  given local (address, port) pair.  When INADDR_ANY is
       specified in the bind call the socket will be bound to all
       local interfaces.  When listen(2) or connect(2) is called
       on an unbound socket, the socket is automatically bound to a
       random free port with the local address set to INADDR_ANY.

       A TCP local socket address that has been bound is unavail­
       able  for some time after closing, unless the SO_REUSEADDR
       flag has been set.  Care should be taken when  using  this
       flag as it makes TCP less reliable.
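
       The binding rules above can be sketched as follows.  This is a
       minimal illustration, not a reference implementation; the
       helper name open_listener and the backlog of 5 are assumptions
       made for the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP socket bound to INADDR_ANY on `port`
 * (host byte order; 0 lets the kernel pick a free port) and start
 * listening.  Returns the descriptor, or -1 on failure. */
int open_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* Allow the local address to be rebound soon after a close. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;                 /* always AF_INET */
    addr.sin_port = htons(port);               /* network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* all local interfaces */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;
    if (listen(fd, 5) < 0)                     /* backlog is an arbitrary choice */
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

       Passing 0 as the port leaves the choice to the kernel; the
       chosen port can then be read back with getsockname(2).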



ADDRESS FORMAT

       An  IP socket address is defined as a combination of an IP
       interface address and a port number. The basic IP protocol
       does not supply port numbers; they are implemented by
       higher level protocols like udp(7)  and  tcp(7).   On  raw
       sockets sin_port is set to the IP protocol.

              struct sockaddr_in {
                  sa_family_t    sin_family; /* address family: AF_INET */
                  u_int16_t      sin_port;   /* port in network byte order */
                  struct in_addr  sin_addr;  /* internet address */
              };

              /* Internet address. */
              struct in_addr {
                  u_int32_t      s_addr;     /* address in network byte order */
              };

       sin_family is always set to AF_INET.  This is required; in
       Linux 2.2 most networking  functions  return  EINVAL  when
       this  setting  is  missing.  sin_port contains the port in
       network byte order. The port numbers below 1024 are called
       reserved  ports.   Only processes with effective user id 0
       or the CAP_NET_BIND_SERVICE capability may bind(2) to
       these ports.  Note that the raw IPv4 protocol as such has
       no concept of a port; ports are implemented only by higher
       protocols like tcp(7) and udp(7).

       sin_addr is the IP host address.  The s_addr member of
       struct in_addr contains the host interface address in
       network byte order.  in_addr should only be accessed using
       the inet_aton(3), inet_addr(3) and inet_makeaddr(3) library
       functions, or directly with the name resolver (see
       gethostbyname(3)).

       IPv4 addresses are divided into unicast, broadcast and
       multicast addresses.  Unicast addresses specify a single
       interface of a host, broadcast addresses specify all hosts
       on a network, and multicast addresses address all hosts in
       a multicast group.  Datagrams to broadcast addresses can
       only be sent or received when the SO_BROADCAST socket flag
       is set.  In the current implementation, connection-oriented
       sockets are only allowed to use unicast addresses.

       Note that the address and the port are  always  stored  in
       network order.  In particular, this means that you need to
       call htons(3) on the number that is assigned  to  a  port.
       All  address/port  manipulation  functions in the standard
       library work in network order.

       There  are  several  special  addresses:   INADDR_LOOPBACK
       (127.0.0.1)  always refers to the local host via the loop­
       back device; INADDR_ANY (0.0.0.0) means  any  address  for
       binding; INADDR_BROADCAST (255.255.255.255) means any host
       and has the same effect on bind as INADDR_ANY for histori­
       cal reasons.
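
       As an illustration of the byte-order rules above, the
       following sketch fills a sockaddr_in from a dotted-quad string
       and a host-order port.  The helper name make_addr is an
       assumption made for this example.

```c
#define _DEFAULT_SOURCE   /* makes inet_aton() visible under strict -std modes */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill `sa` from a dotted-quad string and a port
 * given in host byte order.  Returns 0 on success, or -1 if the string
 * is not a valid IPv4 address. */
int make_addr(struct sockaddr_in *sa, const char *dotted, unsigned short port)
{
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;       /* required; never left unset */
    sa->sin_port = htons(port);     /* convert port to network order */

    /* inet_aton() stores the address already in network order. */
    if (inet_aton(dotted, &sa->sin_addr) == 0)
        return -1;
    return 0;
}
```

       For example, make_addr(&sa, "127.0.0.1", 8080) yields an
       address whose s_addr equals htonl(INADDR_LOOPBACK).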



SOCKET OPTIONS

       IP supports some protocol specific socket options that can
       be set with setsockopt(2)  and  read  with  getsockopt(2).
       The socket option level for IP is SOL_IP.



       IP_OPTIONS
              Set or get the IP options to be sent with every
              packet from  this  socket.   The  arguments  are  a
              pointer  to  a memory buffer containing the options
              and the option length.  The setsockopt(2) call sets
              the IP options associated with a socket.  The maxi­
              mum option size for IPv4 is 40  bytes.  See  RFC791
              for  the  allowed options. When the initial connec­
              tion request packet for a SOCK_STREAM  socket  con­
              tains  IP options, the IP options will be set auto­
              matically to the options from  the  initial  packet
              with  routing  headers  reversed.  Incoming packets
              are not allowed to change options after the connec­
              tion  is established.  The processing of all incom­
              ing source routing options is disabled  by  default
              and can be enabled by using the accept_source_route
              sysctl.  Other options like  timestamps  are  still
              handled.  For datagram sockets, IP options can only
              be set by the local user.  Calling getsockopt(2)
              with  IP_OPTIONS  puts  the current IP options used
              for sending into the supplied buffer.


       IP_PKTINFO
              Pass an IP_PKTINFO ancillary message that  contains
              a  pktinfo structure that supplies some information
              about the incoming  packet.  This  only  works  for
              datagram oriented sockets.

              struct in_pktinfo
              {
                  unsigned int   ipi_ifindex;  /* Interface index */
                  struct in_addr ipi_spec_dst; /* Routing destination address */
                  struct in_addr ipi_addr;     /* Header Destination address */
              };

              ipi_ifindex  is  the  unique index of the interface
              the packet was received on.   ipi_spec_dst  is  the
              destination  address of the routing table entry and
              ipi_addr is the destination address in  the  packet
              header.  If IP_PKTINFO is passed to sendmsg(2) then
              the outgoing packet will be sent over the interface
              specified   in  ipi_ifindex  with  the  destination
              address set to ipi_spec_dst.
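
              Receiving the ancillary message could look roughly
              like the sketch below.  It assumes IP_PKTINFO has
              already been enabled on the socket with
              setsockopt(2); the helper name recv_pktinfo is
              hypothetical.

```c
#define _GNU_SOURCE       /* exposes struct in_pktinfo in glibc */
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: receive one datagram on `fd` and copy the
 * accompanying in_pktinfo into *info.  Returns the number of bytes
 * received, or -1 on error or if no IP_PKTINFO message was attached. */
ssize_t recv_pktinfo(int fd, void *buf, size_t len, struct in_pktinfo *info)
{
    /* The union keeps the control buffer aligned for struct cmsghdr. */
    union {
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } ctl;
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *c;
    ssize_t n;

    iov.iov_base = buf;
    iov.iov_len = len;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    /* Walk the control messages looking for the pktinfo structure. */
    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_IP && c->cmsg_type == IP_PKTINFO) {
            memcpy(info, CMSG_DATA(c), sizeof(*info));
            return n;
        }
    }
    return -1;
}
```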


       IP_RECVTOS
              If enabled the IP_TOS ancillary message  is  passed
              with  incoming  packets.  It  contains a byte which
              specifies the Type of Service/Precedence  field  of
              the packet header.  Expects a boolean integer flag.


       IP_RECVTTL
              When this flag is set, pass an IP_RECVTTL control
              message with the time to live field of the received
              packet as a byte.  Not  supported  for  SOCK_STREAM
              sockets.


       IP_RECVOPTS
              Pass all incoming IP options to the user in an
              IP_OPTIONS control message.  The routing header and
              other  options  are already filled in for the local
              host. Not supported for SOCK_STREAM sockets.


       IP_RETOPTS
              Identical to IP_RECVOPTS  but  returns  raw  unpro­
              cessed  options  with  timestamp  and  route record
              options not filled in for this hop.


       IP_TOS Set or receive the Type-Of-Service (TOS) field that
              is  sent with every IP packet originating from this
              socket. It is used to  prioritize  packets  on  the
              network.   TOS  is  a byte. There are some standard
              TOS  flags  defined:  IPTOS_LOWDELAY  to   minimize
              delays for interactive traffic, IPTOS_THROUGHPUT to
              optimize throughput, IPTOS_RELIABILITY to  optimize
              for  reliability,  IPTOS_MINCOST should be used for
              "filler data" where slow transmission doesn't  mat­
              ter.  At most one of these TOS values can be speci­
              fied. Other bits are invalid and shall be  cleared.
              Linux   sends  IPTOS_LOWDELAY  datagrams  first  by
              default, but the exact  behaviour  depends  on  the
              configured queueing discipline.  Some high priority
              levels may require an effective user id of 0 or the
              CAP_NET_ADMIN capability.  The priority can also be
              set in a protocol-independent way by the
              (SOL_SOCKET, SO_PRIORITY) socket option (see
              socket(7)).


       IP_TTL Set or retrieve the current time to live field that
              is sent in every packet sent from this socket.
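
              Setting IP_TOS and IP_TTL might look like the
              sketch below.  The helper names set_tos_ttl and
              get_ttl are assumptions for illustration; the
              IPTOS_* constants are taken from <netinet/ip.h>.

```c
#define _GNU_SOURCE       /* exposes the IPTOS_* constants in glibc */
#include <netinet/in.h>
#include <netinet/ip.h>   /* IPTOS_LOWDELAY and friends */
#include <sys/socket.h>

/* Hypothetical helper: set the TOS byte and the TTL used for packets
 * sent from `fd`.  Returns 0 on success, -1 on failure. */
int set_tos_ttl(int fd, int tos, int ttl)
{
    if (setsockopt(fd, SOL_IP, IP_TOS, &tos, sizeof(tos)) < 0)
        return -1;
    if (setsockopt(fd, SOL_IP, IP_TTL, &ttl, sizeof(ttl)) < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: read back the TTL currently set on `fd`.
 * Returns the TTL, or -1 on failure. */
int get_ttl(int fd)
{
    int ttl;
    socklen_t len = sizeof(ttl);

    if (getsockopt(fd, SOL_IP, IP_TTL, &ttl, &len) < 0)
        return -1;
    return ttl;
}
```

              A caller with interactive traffic might use
              set_tos_ttl(fd, IPTOS_LOWDELAY, 32).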


       IP_HDRINCL
              If enabled, the user supplies an IP header in front
              of the user data. Only valid for SOCK_RAW  sockets.
              See  raw(7) for more information. When this flag is
              enabled the values set by  IP_OPTIONS,  IP_TTL  and
              IP_TOS are ignored.


       IP_RECVERR
              Enable  extended  reliable  error  message passing.
              When enabled on a  datagram  socket  all  generated
              errors  will be queued in a per-socket error queue.
              When the user receives an error from a socket oper­
              ation   the  errors  can  be  received  by  calling
              recvmsg(2) with  the  MSG_ERRQUEUE  flag  set.  The
              sock_extended_err  structure  describing  the error
              will be passed in an ancillary message with the type
              IP_RECVERR  and  the  level SOL_IP.  This is useful
              for reliable error handling on unconnected sockets.
              The  received  data portion of the error queue con­
              tains the error packet.

              IP uses the sock_extended_err structure as follows:
              ee_origin  is  set  to SO_EE_ORIGIN_ICMP for errors
              received as an ICMP packet,  or  SO_EE_ORIGIN_LOCAL
              for  locally generated errors.  ee_type and ee_code
              are set from the type and code fields of  the  ICMP
              header.   ee_info  contains  the discovered MTU for
              EMSGSIZE errors.  ee_data is  currently  not  used.
              When  the error originated from the network, all IP
              options (IP_OPTIONS, IP_TTL, etc.) enabled  on  the
              socket and contained in the error packet are passed
              as control messages.  The  payload  of  the  packet
              causing the error is returned as normal data.

              On  SOCK_STREAM  sockets,  IP_RECVERR  has slightly
              different semantics. Instead of saving  the  errors
              for the next timeout, it passes all incoming errors
              immediately to the user. This might be  useful  for
              very  short-lived  TCP  connections which need fast
              error handling. Use this option with care: it makes
              TCP  unreliable by not allowing it to recover prop­
              erly from routing shifts and  other  normal  condi­
              tions  and breaks the protocol specification.  Note
              that TCP has no error queue; MSG_ERRQUEUE is  ille­
              gal  on  SOCK_STREAM  sockets.  Thus all errors are
              returned by  socket  function  return  or  SO_ERROR
              only.

              For raw sockets, IP_RECVERR enables passing of all
              received ICMP errors to the application; otherwise
              errors are only reported on connected sockets.

              It  sets  or  retrieves  an  integer  boolean flag.
              IP_RECVERR defaults to off.


       IP_MTU_DISCOVER
              Sets or receives the Path MTU Discovery setting for
              a socket. When enabled, Linux will perform Path MTU
              Discovery as defined in RFC1191 on this socket. The
              don't fragment flag is set on all outgoing
              datagrams.  The system-wide default is controlled
              by the ip_no_pmtu_disc sysctl for SOCK_STREAM
              sockets, and disabled on all others.  For
              non-SOCK_STREAM sockets it is the user's
              responsibility to packetize the data in MTU-sized
              chunks and to do the retransmits if necessary.  The
              kernel will reject packets that are bigger than the
              known path MTU if this flag is set (with EMSGSIZE).

              Path MTU discovery flags   Meaning
              IP_PMTUDISC_WANT           Use per-route settings.
              IP_PMTUDISC_DONT           Never do Path MTU Discovery.
              IP_PMTUDISC_DO             Always do Path MTU Discovery.


              When PMTU discovery is enabled the kernel automati­
              cally keeps track of the path MTU  per  destination
              host.  When it is connected to a specific peer with
              connect(2) the currently  known  path  MTU  can  be
              retrieved  conveniently  using  the  IP_MTU  socket
              option (e.g. after an EMSGSIZE error occurred).  It
              may change over time.  For connectionless sockets
              with many destinations, the new MTU for a given
              destination can also be accessed using the error
              queue (see IP_RECVERR).  A new error will be queued
              for every incoming MTU update.

              While  MTU discovery is in progress initial packets
              from datagram sockets may be dropped.  Applications
              using  UDP  should be aware of this and not take it
              into account for their packet retransmit  strategy.

              To  bootstrap  the  path  MTU  discovery process on
              unconnected sockets it is possible to start with  a
              big  datagram  size  (up to 64K-headers bytes long)
              and let it shrink by updates of the path MTU.

              To get an initial estimate of the path MTU  connect
              a  datagram socket to the destination address using
              connect(2) and retrieve the MTU by calling getsock­
              opt(2) with the IP_MTU option.
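
              The initial-estimate procedure just described can be
              sketched as below.  Note that the Linux headers
              spell the discovery option IP_MTU_DISCOVER; the
              helper name query_path_mtu is an assumption for the
              example.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect a throwaway datagram socket to `dst`
 * and ask the kernel for the currently known path MTU.
 * Returns the MTU in bytes, or -1 on failure. */
int query_path_mtu(const struct sockaddr_in *dst)
{
    int mode = IP_PMTUDISC_DO;    /* always do Path MTU Discovery */
    int mtu = -1;
    socklen_t len = sizeof(mtu);
    int fd = socket(PF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    if (setsockopt(fd, SOL_IP, IP_MTU_DISCOVER, &mode, sizeof(mode)) < 0 ||
        connect(fd, (const struct sockaddr *)dst, sizeof(*dst)) < 0 ||
        getsockopt(fd, SOL_IP, IP_MTU, &mtu, &len) < 0)
        mtu = -1;                 /* IP_MTU is only valid once connected */

    close(fd);
    return mtu;
}
```

              The value is only an estimate; it may shrink later
              as MTU updates arrive for the destination.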


       IP_MTU Retrieve  the current known path MTU of the current
              socket.  Only valid when the socket has  been  con­
              nected.  Returns  an  integer. Only valid as a get­
              sockopt(2).

       IP_ROUTER_ALERT
              Pass all to-be forwarded packets with the IP Router
              Alert option set to this socket. Only valid for raw
              sockets. This is useful,  for  instance,  for  user
              space RSVP daemons.  The tapped packets are not
              forwarded by the kernel; it is the user's
              responsibility to send them out again.  Socket
              binding is ignored; such packets are only filtered
              by protocol.  Expects an integer flag.

       IP_MULTICAST_TTL
              Set or read the time-to-live value of outgoing
              multicast packets  for  this  socket.  It  is  very
              important for multicast packets to set the smallest
              TTL possible.  The default is 1  which  means  that
              multicast  packets  don't  leave  the local network
              unless the user  program  explicitly  requests  it.
              Argument is an integer.

       IP_MULTICAST_LOOP
              Sets  or  reads  a boolean integer argument whether
              sent multicast packets should be looped back to the
              local sockets.
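
              A sender could limit the reach of its multicast
              traffic along these lines.  The helper name
              set_multicast_scope is hypothetical; both options
              are passed as integers here.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: set the multicast TTL and the loopback flag
 * for packets sent from `fd`.  Returns 0 on success, -1 on failure. */
int set_multicast_scope(int fd, int ttl, int loop)
{
    /* A TTL of 1 keeps packets on the local network. */
    if (setsockopt(fd, SOL_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)) < 0)
        return -1;

    /* A zero flag stops sent packets being looped back locally. */
    if (setsockopt(fd, SOL_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop)) < 0)
        return -1;
    return 0;
}
```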

       IP_ADD_MEMBERSHIP
              Join a multicast group.  Argument is an ip_mreqn
              structure.

              struct ip_mreqn
              {
                  struct in_addr imr_multiaddr; /* IP multicast group address */
                  struct in_addr imr_address;   /* IP address of local interface */
                  int            imr_ifindex;   /* interface index */
              };

              imr_multiaddr contains the address of the multicast
              group  the  application wants to join or leave.  It
              must be a valid multicast address.  imr_address  is
              the  address  of the local interface with which the
              system should join the multicast group;  if  it  is
              equal  to  INADDR_ANY  an  appropriate interface is
              chosen by the system.  imr_ifindex is the interface
              index  of  the interface that should join/leave the
              imr_multiaddr group, or 0 to  indicate  any  inter­
              face.

              For  compatibility,  the  old  ip_mreq structure is
              still supported. It differs from ip_mreqn  only  by
              not  including the imr_ifindex field. Only valid as
              a setsockopt(2).

       IP_DROP_MEMBERSHIP
              Leave a multicast group. Argument is an ip_mreqn or
              ip_mreq structure similar to IP_ADD_MEMBERSHIP.
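
              Joining and leaving a group with ip_mreqn might be
              written as in this sketch.  The helper name
              change_membership is an assumption; imr_address is
              left as INADDR_ANY so that the system or the
              interface index selects the interface.

```c
#define _GNU_SOURCE       /* exposes struct ip_mreqn in glibc */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: join (join != 0) or leave (join == 0) the
 * multicast group `group` (network byte order) on the interface with
 * index `ifindex`, or on any interface if `ifindex` is 0.
 * Returns 0 on success, -1 on failure. */
int change_membership(int fd, struct in_addr group, int ifindex, int join)
{
    struct ip_mreqn mreq;

    memset(&mreq, 0, sizeof(mreq));
    mreq.imr_multiaddr = group;                   /* must be a multicast address */
    mreq.imr_address.s_addr = htonl(INADDR_ANY);  /* let the system choose */
    mreq.imr_ifindex = ifindex;

    return setsockopt(fd, SOL_IP,
                      join ? IP_ADD_MEMBERSHIP : IP_DROP_MEMBERSHIP,
                      &mreq, sizeof(mreq));
}
```

              Passing an address that is not a multicast address
              makes the join fail.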

       IP_MULTICAST_IF
              Set  the local device for a multicast socket. Argu­
              ment is an ip_mreqn or ip_mreq structure similar to
              IP_ADD_MEMBERSHIP.




       When an invalid socket option is passed, ENOPROTOOPT is
       returned.


SYSCTLS

       The IP protocol supports the sysctl interface to configure
       some  global options. The sysctls can be accessed by read­
       ing or writing the /proc/sys/net/ipv4/* files or using the
       sysctl(2) interface.
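
       Reading one of these values through the /proc files can be
       sketched as follows.  The helper name read_ipv4_sysctl is
       hypothetical, and the sketch only handles sysctls whose first
       field is a single integer.

```c
#include <stdio.h>

/* Hypothetical helper: read the first integer stored in
 * /proc/sys/net/ipv4/<name> into *value.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
int read_ipv4_sysctl(const char *name, long *value)
{
    char path[256];
    FILE *f;
    int n, ok;

    n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;                    /* name too long for the buffer */

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

       For instance, read_ipv4_sysctl("ip_default_ttl", &ttl) would
       fetch the default time-to-live described below.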

       ip_default_ttl
              Set  the  default  time-to-live  value  of outgoing
              packets. This can be changed per  socket  with  the
              IP_TTL option.

       ip_forward
              Enable IP forwarding with a boolean flag.  IP
              forwarding can also be set on a per-interface basis.

       ip_dynaddr
              Enable  dynamic  socket  address  and  masquerading
              entry  rewriting  on interface address change. This
              is useful for dial-up interfaces with changing IP
              addresses.  0 means no rewriting, 1 turns it on and
              2 enables verbose mode.

       ip_autoconfig
              Not documented.

       ip_local_port_range
              Contains two integers that define the default local
              port  range allocated to sockets. Allocation starts
              with the first number and ends with the second num­
              ber.   Note that these should not conflict with the
              ports used by masquerading (although  the  case  is
              handled).  Also, arbitrary choices may cause problems
              with some firewall packet filters that make
              assumptions about the local ports in use.  The first
              number should be greater than 1024, and preferably
              greater than 4096, to avoid clashes with well-known
              ports and to minimize firewall problems.

       ip_no_pmtu_disc
              If enabled, don't do Path  MTU  Discovery  for  TCP
              sockets  by default. Path MTU discovery may fail if
              misconfigured firewalls (that drop all  ICMP  pack­
              ets) or misconfigured interfaces (e.g., a point-to-
              point link where both ends don't agree on the
              MTU)  are on the path. It is better to fix the bro­
              ken routers on the path than to turn off  Path  MTU
              Discovery  globally,  because not doing it incurs a
              high cost to the network.

       ipfrag_high_thresh, ipfrag_low_thresh
              If the amount of queued IP fragments reaches
              ipfrag_high_thresh, the queue is pruned down to
              ipfrag_low_thresh.  Contains an integer with the
              number of bytes.

       ip_always_defrag
              [New with kernel 2.2.13; in earlier kernel versions
              the feature was controlled at compile time by the
              CONFIG_IP_ALWAYS_DEFRAG option]

              When this boolean flag is enabled (not equal to 0),
              incoming fragments (parts of IP packets that arose
              when some host between origin and destination
              decided that the packets were too large and cut
              them into pieces) will be reassembled
              (defragmented) before being processed, even if they
              are about to be forwarded.

              Only enable this if running either a firewall that
              is the sole link to your network or a transparent
              proxy; never turn it on for a normal router or
              host.  Otherwise fragmented communication may be
              disturbed when the fragments travel over different
              links.  Defragmentation also has a large memory and
              CPU time cost.

              This is automagically turned on when masquerading
              or transparent proxying is configured.

       neigh/*
              See arp(7).


IOCTLS

       All ioctls described in socket(7) apply to ip.

       The ioctls to  configure  firewalling  are  documented  in
       ipfw(7) from the ipchains package.

       Ioctls   to   configure   generic  device  parameters  are
       described in netdevice(7).


NOTES

       Be very careful with the SO_BROADCAST option - it  is  not
       privileged  in  Linux.  It is easy to overload the network
       with careless broadcasts. For new application protocols it
       is  better  to use a multicast group instead of broadcast­
       ing. Broadcasting is discouraged.

       Some other BSD sockets implementations provide the
       IP_RECVDSTADDR and IP_RECVIF socket options to get the
       destination address and the interface of received
       datagrams.  Linux has the more general IP_PKTINFO for the
       same task.


ERRORS

       ENOTCONN
              The  operation  is  only  defined  on  a  connected
              socket, but the socket wasn't connected.

       EINVAL Invalid argument passed.  For send operations  this
              can be caused by sending to a blackhole route.

       EMSGSIZE
              Datagram  is  bigger than an MTU on the path and it
              cannot be fragmented.

       EACCES The user tried to execute an operation without  the
              necessary  permissions.   These  include: Sending a
              packet to a broadcast address  without  having  the
              SO_BROADCAST flag set.  Sending a packet via a pro­
              hibit route.  Modifying firewall  settings  without
              CAP_NET_ADMIN or effective user id 0.  Binding to a
              reserved  port  without  the   CAP_NET_BIND_SERVICE
              capability or effective user id 0.


       EADDRINUSE
              Tried to bind to an address already in use.

       ENOMEM and ENOBUFS
              Not enough memory available.

       ENOPROTOOPT and EOPNOTSUPP
              Invalid socket option passed.

       EPERM  User  doesn't have permission to set high priority,
              change  configuration,  or  send  signals  to   the
              requested process or group.

       EADDRNOTAVAIL
              A  non-existent  interface  was  requested  or  the
              requested source address was not local.

       EAGAIN Operation on a non-blocking socket would block.

       ESOCKTNOSUPPORT
              The socket is not configured or an  unknown  socket
              type was requested.

       EISCONN
              connect(2)  was  called  on  an  already  connected
              socket.

       EALREADY
              A connection operation on a non-blocking socket is
              already in progress.

       ECONNABORTED
              A connection was closed during an accept(2).

       EPIPE  The connection was unexpectedly closed or shut down
              by the other end.

       ENOENT SIOCGSTAMP was called on a socket where  no  packet
              arrived.

       EHOSTUNREACH
              No valid routing table entry matches the
              destination address.  This error can be caused by an
              ICMP message from a remote router or by the local
              routing table.

       ENODEV Network device not  available  or  not  capable  of
              sending IP.

       ENOPKG A kernel subsystem was not configured.

       ENOBUFS, ENOMEM
              Not  enough free memory.  This often means that the
              memory allocation is limited by the  socket  buffer
              limits,  not  by the system memory, but this is not
              100% consistent.

       Other errors may be generated by the overlaying protocols;
       see tcp(7), raw(7), udp(7) and socket(7).


VERSIONS

       IP_PKTINFO, IP_MTU, IP_MTU_DISCOVER, IP_RECVERR and
       IP_ROUTER_ALERT are new options in Linux 2.2.

       struct  ip_mreqn is new in Linux 2.2.  Linux 2.0 only sup­
       ported ip_mreq.

       The sysctls were introduced with Linux 2.2.


COMPATIBILITY

       For compatibility with Linux 2.0, the obsolete
       socket(PF_INET, SOCK_PACKET, protocol) syntax is still
       supported to open a packet(7) socket.  This is deprecated
       and should be replaced by socket(PF_PACKET, SOCK_RAW,
       protocol) instead.  The main difference is the new
       sockaddr_ll address structure for generic link layer
       information instead of the old sockaddr_pkt.


BUGS

       There are too many inconsistent error values.

       The ioctls to configure IP-specific interface options  and
       ARP tables are not described.





AUTHORS

       This man page was written by Andi Kleen.


SEE ALSO

       sendmsg(2),  recvmsg(2),  socket(7),  netlink(7),  tcp(7),
       udp(7), raw(7), ipfw(7).

       RFC791 for the original IP specification.
       RFC1122 for the IPv4 host requirements.
       RFC1812 for the IPv4 router requirements.

