Introduction of gtp5g and some kernel concepts

Note

Author: Jimmy
Date: 2023/9/20


Overview

GTP (GPRS Tunneling Protocol, where GPRS stands for General Packet Radio Service) is a group of IP-based communication protocols used to carry GPRS traffic within LTE, 5G NR, and other networks. GTP can be broken down into several components: GTP-C, GTP-U, and GTP prime (GTP'). GTP-C and GTP-U serve the control plane and the user plane, respectively, while GTP' is used to transmit charging data over the Ga interface defined for the 3GPP GPRS core network.

In free5GC, the UPF network function takes on the control part and determines the correct routing path for every packet passing through the core network. GTP-U, on the other hand, is handled by gtp5g, which transports the packets inside the kernel through the kernel module it builds.
This article introduces how gtp5g helps free5GC handle packets, along with some related kernel concepts.

Let's start the journey!

Additional information:

  • The Linux kernel version used in this article is 5.4.0-159-generic. Some details differ in other versions, but the main concepts are the same.
  • Gtp5g version is v0.8.2
  • UPF version is v1.2.0

Before we continue, I need to introduce Netlink. What is Netlink? Netlink is an IPC (inter-process communication) mechanism that connects kernel-space and user-space processes through sockets. Traditionally, three methods were used for communication between the kernel and user space: ioctl, sysfs, and procfs. With these methods, however, communication can only be initiated from user space, not from kernel space. Netlink allows communication to be initiated from either side, and it also offers:

  • Bidirectional transmission, asynchronous communication.
  • Standard socket API used in user space.
  • Specialized API used in kernel space.
  • Support for multicast.
  • Support for 32 protocol types.

Several protocol types are defined in include/uapi/linux/netlink.h:

#define NETLINK_ROUTE       0   /* Routing/device hook              */
#define NETLINK_UNUSED      1   /* Unused number                */
#define NETLINK_USERSOCK    2   /* Reserved for user mode socket protocols  */
#define NETLINK_FIREWALL    3   /* Unused number, formerly ip_queue     */
#define NETLINK_SOCK_DIAG   4   /* socket monitoring                */
#define NETLINK_NFLOG       5   /* netfilter/iptables ULOG */
#define NETLINK_XFRM        6   /* ipsec */
#define NETLINK_SELINUX     7   /* SELinux event notifications */
#define NETLINK_ISCSI       8   /* Open-iSCSI */
#define NETLINK_AUDIT       9   /* auditing */
#define NETLINK_FIB_LOOKUP  10  
#define NETLINK_CONNECTOR   11
#define NETLINK_NETFILTER   12  /* netfilter subsystem */
#define NETLINK_IP6_FW      13
#define NETLINK_DNRTMSG     14  /* DECnet routing messages */
#define NETLINK_KOBJECT_UEVENT  15  /* Kernel messages to userspace */
#define NETLINK_GENERIC     16
/* leave room for NETLINK_DM (DM Events) */
#define NETLINK_SCSITRANSPORT   18  /* SCSI Transports */
#define NETLINK_ECRYPTFS    19
#define NETLINK_RDMA        20
#define NETLINK_CRYPTO      21  /* Crypto layer */
#define NETLINK_SMC     22  /* SMC monitoring */

#define NETLINK_INET_DIAG   NETLINK_SOCK_DIAG

#define MAX_LINKS 32

These are the Netlink protocols predefined by Linux. Users who want to define their own Netlink protocol therefore have to modify kernel source files, which is something we want to avoid. In addition, the number of protocols cannot exceed 32 (MAX_LINKS).
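
To make the "standard socket API" point a bit more concrete, here is a minimal user-space sketch that opens and binds a Netlink socket. The helper names nl_protocol_valid and open_netlink are made up for this example, and the bound of 32 is simply copied from the MAX_LINKS definition above.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Copied from the MAX_LINKS definition quoted above. */
#define SKETCH_MAX_LINKS 32

/* Hypothetical helper: is this protocol number inside the Netlink range? */
int nl_protocol_valid(int protocol)
{
    return protocol >= 0 && protocol < SKETCH_MAX_LINKS;
}

/* Hypothetical helper: open and bind a Netlink socket for `protocol`.
 * Returns the file descriptor, or -1 on failure. */
int open_netlink(int protocol)
{
    struct sockaddr_nl addr;
    int fd;

    if (!nl_protocol_valid(protocol))
        return -1;

    fd = socket(AF_NETLINK, SOCK_RAW, protocol);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;    /* nl_pid stays 0: the kernel picks a port ID */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling open_netlink(NETLINK_ROUTE) would give an rtnetlink socket and open_netlink(NETLINK_GENERIC) a Generic Netlink socket; both are ordinary file descriptors that can be used with send() and recv().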

Because protocol numbers are scarce and modifying the kernel is undesirable, kernel developers extended Netlink and introduced Generic Netlink. Generic Netlink supports 1023 protocols (families), which removes the protocol number limitation, and it allocates protocol IDs automatically.

The following figure shows the Generic Netlink structure:

graph TD
A1[Application_1] --- B[Kernel_Socket_API] 
A2[Application_2] --- B[Kernel_Socket_API]

B[Kernel_Socket_API] --- C[Netlink_Subsystem]

C[Netlink_Subsystem] --- D[Generic_Netlink_Bus]

D[Generic_Netlink_Bus] --- E1[Controller]
D[Generic_Netlink_Bus] --- E2[Kernel_User_1]
D[Generic_Netlink_Bus] --- E3[Kernel_User_2]

  • The Generic Netlink users Application_1 and Application_2 can communicate with both user-space and kernel-space endpoints through the kernel socket API.
  • The Netlink subsystem serves as the underlying transport layer for all Generic Netlink communication.
  • The Generic Netlink bus is implemented inside the kernel, but it is available to user space through the socket API and inside the kernel through the normal Netlink and Generic Netlink APIs.
  • The Generic Netlink users communicate with each other over the Generic Netlink bus; users can exist in both kernel space and user space.
  • The Generic Netlink controller is part of the kernel and is responsible for dynamically allocating Generic Netlink communication channels and for other management tasks. It is implemented as a standard Generic Netlink kernel user, but it listens on a special, pre-allocated Generic Netlink channel.
  • The kernel socket API: Generic Netlink sockets are created with the PF_NETLINK domain and the NETLINK_GENERIC protocol value.
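
A user-space program usually finds a family's channel by asking the controller with a CTRL_CMD_GETFAMILY request. The following sketch only builds the bytes of such a request for the family name "gtp5g"; sending it and parsing the reply are left out. The function name build_getfamily_request is made up for this example, and the sequence number and header version are arbitrary values chosen for the sketch.

```c
#include <stddef.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/* Build a CTRL_CMD_GETFAMILY request for the family called `name`.
 * Returns the message length, or 0 if the name or buffer does not fit. */
size_t build_getfamily_request(unsigned char *buf, size_t buflen, const char *name)
{
    size_t name_len = strlen(name) + 1;        /* attribute value includes the NUL */
    size_t attr_len = NLA_HDRLEN + name_len;   /* attribute header + value, unpadded */
    size_t total = NLMSG_HDRLEN + GENL_HDRLEN + NLA_ALIGN(attr_len);
    struct nlmsghdr nlh;
    struct genlmsghdr genl;
    struct nlattr attr;
    unsigned char *p = buf;

    if (name_len > GENL_NAMSIZ || total > buflen)
        return 0;

    memset(buf, 0, total);                     /* also zeroes the padding bytes */

    memset(&nlh, 0, sizeof(nlh));
    nlh.nlmsg_len = (__u32)total;
    nlh.nlmsg_type = GENL_ID_CTRL;             /* the controller's pre-allocated channel */
    nlh.nlmsg_flags = NLM_F_REQUEST;
    nlh.nlmsg_seq = 1;                         /* arbitrary for this sketch */
    memcpy(p, &nlh, sizeof(nlh));
    p += NLMSG_HDRLEN;

    memset(&genl, 0, sizeof(genl));
    genl.cmd = CTRL_CMD_GETFAMILY;
    genl.version = 1;                          /* arbitrary for this sketch */
    memcpy(p, &genl, sizeof(genl));
    p += GENL_HDRLEN;

    attr.nla_len = (__u16)attr_len;
    attr.nla_type = CTRL_ATTR_FAMILY_NAME;
    memcpy(p, &attr, sizeof(attr));
    memcpy(p + NLA_HDRLEN, name, name_len);

    return total;
}
```

The resulting buffer would be written to a socket created with PF_NETLINK and NETLINK_GENERIC; the controller's reply carries the dynamically allocated family ID.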

The last one is rtnetlink, also known as the Netlink protocol type NETLINK_ROUTE. Through it, a user-space program can read and alter the kernel's routing table or create new network devices.

free5GC UPF

Since gtp5g is logically part of the UPF, this article also covers part of the UPF.

The Driver interface provides functions for communicating with gtp5g (they map one-to-one to gtp5g_genl_ops[] in genl.c). When the UPF receives a PFCP message, it parses the content and then calls the corresponding Driver functions to instruct gtp5g to apply the relevant rules.

// internal/forwarder/driver.go
type Driver interface {
    Close()

    CreatePDR(uint64, *ie.IE) error
    UpdatePDR(uint64, *ie.IE) error
    RemovePDR(uint64, *ie.IE) error

    CreateFAR(uint64, *ie.IE) error
    UpdateFAR(uint64, *ie.IE) error
    RemoveFAR(uint64, *ie.IE) error

    CreateQER(uint64, *ie.IE) error
    UpdateQER(uint64, *ie.IE) error
    RemoveQER(uint64, *ie.IE) error

    CreateURR(uint64, *ie.IE) error
    UpdateURR(uint64, *ie.IE) ([]report.USAReport, error)
    RemoveURR(uint64, *ie.IE) ([]report.USAReport, error)
    QueryURR(uint64, uint32) ([]report.USAReport, error)

    CreateBAR(uint64, *ie.IE) error
    UpdateBAR(uint64, *ie.IE) error
    RemoveBAR(uint64, *ie.IE) error

    HandleReport(report.Handler)
}

The UPF uses rtnetlink (rtnl) to create a device (interface) named upfgtp. Users can observe it while run.sh is executing.

func OpenGtp5gLink(mux *nl.Mux, addr string, mtu uint32, log *logrus.Entry) (*Gtp5gLink, error) {
    g := &Gtp5gLink{
        log: log,
    }

    g.mux = mux

    rtconn, err := nl.Open(syscall.NETLINK_ROUTE)
    if err != nil {
        return nil, errors.Wrap(err, "open")
    }
    g.rtconn = rtconn
    g.client = nl.NewClient(rtconn, mux)

    laddr, err := net.ResolveUDPAddr("udp4", addr)
    if err != nil {
        g.Close()
        return nil, errors.Wrap(err, "resolve addr")
    }
    conn, err := net.ListenUDP("udp4", laddr)
    if err != nil {
        g.Close()
        return nil, errors.Wrap(err, "listen")
    }
    g.conn = conn

    // TODO: Duplicate fd
    f, err := conn.File()
    if err != nil {
        g.Close()
        return nil, errors.Wrap(err, "file")
    }
    g.f = f

    linkinfo := &nl.Attr{
        Type: syscall.IFLA_LINKINFO,
        Value: nl.AttrList{
            {
                Type:  rtnllink.IFLA_INFO_KIND,
                Value: nl.AttrString("gtp5g"),
            },
            {
                Type: rtnllink.IFLA_INFO_DATA,
                Value: nl.AttrList{
                    {
                        Type:  gtp5gnl.IFLA_FD1,
                        Value: nl.AttrU32(f.Fd()),
                    },
                    {
                        Type:  gtp5gnl.IFLA_HASHSIZE,
                        Value: nl.AttrU32(131072),
                    },
                },
            },
        },
    }
    attrs := []*nl.Attr{linkinfo}

    if mtu != 0 {
        attrs = append(attrs, &nl.Attr{
            Type:  syscall.IFLA_MTU,
            Value: nl.AttrU32(mtu),
        })
    }

    err = rtnllink.Create(g.client, "upfgtp", attrs...)
    if err != nil {
        g.Close()
        return nil, errors.Wrap(err, "create")
    }
    err = rtnllink.Up(g.client, "upfgtp")
    if err != nil {
        g.Close()
        return nil, errors.Wrap(err, "up")
    }
    link, err := gtp5gnl.GetLink("upfgtp")
    if err != nil {
        g.Close()
        return nil, errors.Wrap(err, "get link")
    }
    g.link = link
    return g, nil
}

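Next, the UPF connects its Driver functions to gtp5g_genl_ops: it looks up the gtp5g Generic Netlink family and opens a Generic Netlink socket on the family's multicast group.

// internal/forwarder/buffnetlink/server.go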
func OpenServer(wg *sync.WaitGroup, client *nl.Client, mux *nl.Mux) (*Server, error) {
    s := &Server{
        client: client,
        mux:    mux,
    }

    f, err := genl.GetFamily(s.client, "gtp5g")
    if err != nil {
        return nil, errors.Wrap(err, "get family")
    }

    s.conn, err = nl.Open(syscall.NETLINK_GENERIC, int(f.Groups[gtp5gnl.GENL_MCGRP].ID))
    if err != nil {
        return nil, errors.Wrap(err, "open netlink")
    }

    err = s.mux.PushHandler(s.conn, s)
    if err != nil {
        return nil, errors.Wrap(err, "push handler")
    }

    logger.BuffLog.Infof("buff netlink server started")

    // wg.Add(1)
    return s, nil
}

GTP5G

  • Gtp5g uses a Linux kernel module to handle packet traffic. A Linux kernel module can be thought of as a small piece of code that is inserted into the running Linux kernel, allowing users to extend the kernel to fit their needs, for example to support a particular hardware device.
  • In gtp5g, the module entry point is gtp5g_init in gtp5g.c; it touches most of the components and techniques that gtp5g relies on. This article investigates the following concepts further:
    • Network device -> net_device_ops
    • Rtnetlink -> gtp5g_link_ops
    • Generic Netlink -> gtp5g_genl_family
  • Additionally, the article presents two functions in detail:
    • rtnl_link_register()
    • genl_register_family()

// src/gtp5g.c
static int __init gtp5g_init(void)
{
    int err;

    GTP5G_LOG(NULL, "Gtp5g Module initialization Ver: %s\n", DRV_VERSION);

    init_proc_gtp5g_dev_list();

    // set hash initial value
    get_random_bytes(&gtp5g_h_initval, sizeof(gtp5g_h_initval));

    err = rtnl_link_register(&gtp5g_link_ops);
    if (err < 0) {
        GTP5G_ERR(NULL, "Failed to register rtnl\n");
        goto error_out;
    }

    err = genl_register_family(&gtp5g_genl_family);
    if (err < 0) {
        GTP5G_ERR(NULL, "Failed to register generic\n");
        goto unreg_rtnl_link;
    }

    err = register_pernet_subsys(&gtp5g_net_ops);
    if (err < 0) {
        GTP5G_ERR(NULL, "Failed to register namespace\n");
        goto unreg_genl_family;
    }

    err = create_proc();
    if (err < 0) {
        goto unreg_pernet;
    }
    GTP5G_LOG(NULL, "5G GTP module loaded\n");

    return 0;
    ...
}

net_device_ops

It is declared in dev.h and defined in dev.c. The net_device_ops structure covers all operations related to a network device, and gtp5g implements a subset of these operations to build its own netdev ops.

// include/dev.h
extern const struct net_device_ops gtp5g_netdev_ops;

// src/gtpu/dev.c
const struct net_device_ops gtp5g_netdev_ops = {
    .ndo_init           = gtp5g_dev_init,
    .ndo_uninit         = gtp5g_dev_uninit,
    .ndo_start_xmit     = gtp5g_dev_xmit,
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 11, 0)
    .ndo_get_stats64    = dev_get_tstats64,
#else
    .ndo_get_stats64    = ip_tunnel_get_stats64,
#endif
};

The hooks of net_device_ops are documented in /include/linux/netdevice.h:

  • .ndo_init: This function is called once when a network device is registered. The network device can use this for any late stage initialization or semantic validation. It can fail with an error code which will be propagated back to register_netdev.
  • .ndo_uninit: This function is called when device is unregistered or when registration fails. It is not called if init fails.
  • .ndo_start_xmit: Called when a packet needs to be transmitted. Returns NETDEV_TX_OK. Can return NETDEV_TX_BUSY, but you should stop the queue before that can happen; it's for obsolete devices and weird corner cases, but the stack really does a non-trivial amount of useless work if you return NETDEV_TX_BUSY. Required; cannot be NULL.
    struct net_device_ops {
      int           (*ndo_init)(struct net_device *dev);
      void          (*ndo_uninit)(struct net_device *dev);
      netdev_tx_t       (*ndo_start_xmit)(struct sk_buff *skb,
                          struct net_device *dev);
    ...
    }
    

    Gtp5g defines its own device structure:
    // include/dev.h
    struct gtp5g_dev {
        struct list_head list;
        struct sock *sk1u; // UDP socket from user space
        struct net_device *dev;
        unsigned int role;
        unsigned int hash_size;
        struct hlist_head *pdr_id_hash;
        struct hlist_head *far_id_hash;
        struct hlist_head *qer_id_hash;
        struct hlist_head *bar_id_hash;
        struct hlist_head *urr_id_hash;
    
        struct hlist_head *i_teid_hash; // Used for GTP-U packet detect
        struct hlist_head *addr_hash;   // Used for IPv4 packet detect
    
        /* IEs list related to PDR */
        struct hlist_head *related_far_hash; // PDR list waiting the FAR to handle
        struct hlist_head *related_qer_hash; // PDR list waiting the QER to handle
        struct hlist_head *related_bar_hash;
        struct hlist_head *related_urr_hash;
    
        /* Used by proc interface */
        struct list_head proc_list;
    };
    

gtp5g_dev_init obtains the address of the device's private data with netdev_priv() and allocates per-CPU space for the device statistics with netdev_alloc_pcpu_stats():

// src/gtpu/dev.c
static int gtp5g_dev_init(struct net_device *dev)
{
    struct gtp5g_dev *gtp = netdev_priv(dev);

    gtp->dev = dev;

    dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
    if (!dev->tstats) {
        return -ENOMEM;
    }

    return 0;
}

According to /include/linux/netdevice.h, the private data starts right after struct net_device, at an offset rounded up to a multiple of 32 (NETDEV_ALIGN):

static inline void *netdev_priv(const struct net_device *dev)
{
    return (char *)dev + ALIGN(sizeof(struct net_device), NETDEV_ALIGN);
}
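
The rounding done by ALIGN() can be tried out in user space. The sketch below reimplements it with made-up names (align_up, priv_offset); dev_size stands in for sizeof(struct net_device), whose real value depends on the kernel configuration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NETDEV_ALIGN 32   /* value of NETDEV_ALIGN in the kernel headers */

/* Same rounding as the kernel's ALIGN(): round x up to the next multiple
 * of a, where a must be a power of two. */
size_t align_up(size_t x, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Offset of the private area from the start of a device structure that is
 * dev_size bytes long. */
size_t priv_offset(size_t dev_size)
{
    return align_up(dev_size, SKETCH_NETDEV_ALIGN);
}
```

For example, a 2000-byte structure would place its private data at offset 2016, the next multiple of 32.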

gtp5g_dev_uninit disables encapsulation on the UDP socket (sk1u), which is used to receive uplink (N3) packets, and frees the per-CPU statistics:

// src/gtpu/dev.c
static void gtp5g_dev_uninit(struct net_device *dev)
{
    struct gtp5g_dev *gtp = netdev_priv(dev);

    gtp5g_encap_disable(gtp->sk1u);
    free_percpu(dev->tstats);
}

gtp5g_dev_xmit is called when the network device receives a downlink (N6) packet that has to be transmitted:

// src/gtpu/dev.c
static netdev_tx_t gtp5g_dev_xmit(struct sk_buff *skb, struct net_device *dev)
{
    unsigned int proto = ntohs(skb->protocol);
    struct gtp5g_pktinfo pktinfo;
    int ret = 0;

    /* Ensure there is sufficient headroom */
    if (skb_cow_head(skb, dev->needed_headroom)) {
        goto tx_err;
    }

    skb_reset_inner_headers(skb);

    /* PDR lookups in gtp5g_build_skb_*() need rcu read-side lock. 
     * */
    rcu_read_lock();
    switch (proto) {
    case ETH_P_IP:
        ret = gtp5g_handle_skb_ipv4(skb, dev, &pktinfo);
        break;
    default:
        ret = -EOPNOTSUPP;
    }
    rcu_read_unlock();

    if (ret < 0)
        goto tx_err;

    if (ret == FAR_ACTION_FORW)
        gtp5g_xmit_skb_ipv4(skb, &pktinfo);

    return NETDEV_TX_OK;
    ...
}

gtp5g_link_ops

The header file is omitted here; this structure defines the Rtnetlink operations.

// src/gtpu/link.c
struct rtnl_link_ops gtp5g_link_ops __read_mostly = {
    .kind         = "gtp5g",
    .maxtype      = IFLA_GTP5G_MAX,
    .policy       = gtp5g_policy,
    .priv_size    = sizeof(struct gtp5g_dev),
    .setup        = gtp5g_link_setup,
    .validate     = gtp5g_validate,
    .newlink      = gtp5g_newlink,
    .dellink      = gtp5g_dellink,
    .get_size     = gtp5g_get_size,
    .fill_info    = gtp5g_fill_info,
};

The definition is in /include/net/rtnetlink.h:

  • .kind: Identifier
  • .maxtype: Highest device specific netlink attribute number
  • .policy: Netlink policy for device specific attribute validation
  • .priv_size: sizeof net_device private space
  • .setup: net_device setup function
  • .validate: Optional validation function for netlink/changelink parameters
  • .newlink: Function for configuring and registering a new device
  • .dellink: Function to remove a device
  • .get_size: Function to calculate required room for dumping device specific netlink attributes
  • .fill_info: Function to dump device specific netlink attributes

struct rtnl_link_ops {
    struct list_head    list;

    const char      *kind;

    size_t          priv_size;
    void            (*setup)(struct net_device *dev);

    unsigned int        maxtype;
    const struct nla_policy *policy;
    int         (*validate)(struct nlattr *tb[],
                        struct nlattr *data[],
                        struct netlink_ext_ack *extack);

    int         (*newlink)(struct net *src_net,
                       struct net_device *dev,
                       struct nlattr *tb[],
                       struct nlattr *data[],
                       struct netlink_ext_ack *extack);
    int         (*changelink)(struct net_device *dev,
                          struct nlattr *tb[],
                          struct nlattr *data[],
                          struct netlink_ext_ack *extack);
    void            (*dellink)(struct net_device *dev,
                       struct list_head *head);

    size_t          (*get_size)(const struct net_device *dev);
    int         (*fill_info)(struct sk_buff *skb,
                         const struct net_device *dev);

    size_t          (*get_xstats_size)(const struct net_device *dev);
    int         (*fill_xstats)(struct sk_buff *skb,
                           const struct net_device *dev);
    unsigned int        (*get_num_tx_queues)(void);
    unsigned int        (*get_num_rx_queues)(void);

    unsigned int        slave_maxtype;
    const struct nla_policy *slave_policy;
    int         (*slave_changelink)(struct net_device *dev,
                            struct net_device *slave_dev,
                            struct nlattr *tb[],
                            struct nlattr *data[],
                            struct netlink_ext_ack *extack);
    size_t          (*get_slave_size)(const struct net_device *dev,
                          const struct net_device *slave_dev);
    int         (*fill_slave_info)(struct sk_buff *skb,
                           const struct net_device *dev,
                           const struct net_device *slave_dev);
    struct net      *(*get_link_net)(const struct net_device *dev);
    size_t          (*get_linkxstats_size)(const struct net_device *dev,
                               int attr);
    int         (*fill_linkxstats)(struct sk_buff *skb,
                           const struct net_device *dev,
                           int *prividx, int attr);
};

When the rtnl link is set up, gtp5g assigns its net_device_ops to the device:

static void gtp5g_link_setup(struct net_device *dev)
{
    dev->netdev_ops = &gtp5g_netdev_ops;   /* <---- network device assignment */
    dev->needs_free_netdev = true;

    dev->hard_header_len = 0;
    dev->addr_len = 0;
    dev->mtu = ETH_DATA_LEN -
        (sizeof(struct iphdr) +
         sizeof(struct udphdr) +
         sizeof(struct gtpv1_hdr));

    /* Zero header length. */
    dev->type = ARPHRD_NONE;
    dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;

    dev->priv_flags |= IFF_NO_QUEUE;
    dev->features |= NETIF_F_LLTX;
    netif_keep_dst(dev);

    /* TODO: Modify the headroom size based on
     * what are the extension header going to support
     * */
    dev->needed_headroom = LL_MAX_HEADER +
        sizeof(struct iphdr) +
        sizeof(struct udphdr) +
        sizeof(struct gtpv1_hdr);
}
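
The default MTU follows directly from the header sizes. Here is a small user-space sketch of the arithmetic; the struct sketch_gtpv1_hdr layout is an assumption based on the 8-byte mandatory GTPv1-U header, and its field names are illustrative rather than taken from gtp5g.

```c
#include <stddef.h>
#include <stdint.h>
#include <netinet/ip.h>    /* struct iphdr */
#include <netinet/udp.h>   /* struct udphdr */

#define SKETCH_ETH_DATA_LEN 1500   /* ETH_DATA_LEN: maximum Ethernet payload */

/* Assumed layout of the mandatory GTPv1-U header (8 bytes). */
struct sketch_gtpv1_hdr {
    uint8_t  flags;
    uint8_t  type;
    uint16_t length;
    uint32_t teid;
};

/* Bytes added in front of the inner packet by the outer IPv4, UDP and GTP headers. */
size_t gtp_overhead(void)
{
    return sizeof(struct iphdr) +
           sizeof(struct udphdr) +
           sizeof(struct sketch_gtpv1_hdr);
}

/* MTU left for the inner packet on a 1500-byte Ethernet link. */
size_t gtp_default_mtu(void)
{
    return SKETCH_ETH_DATA_LEN - gtp_overhead();
}
```

With 20 + 8 + 8 = 36 bytes of overhead, the inner MTU works out to 1464 bytes. GTP extension headers, which the TODO comment above hints at, would reduce it further.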

rtnl_link_register()

  • The definition is in /net/core/rtnetlink.c.
  • This function should be used by drivers that create devices during module initialization. It must be called before registering the devices.
  • It uses the kind field of rtnl_link_ops to check whether the ops already exist in link_ops. If they do not, the ops are appended to the end of link_ops. The actual work is done by __rtnl_link_register(), which rtnl_link_register() calls while holding the rtnl lock.
  • Once registration succeeds, the UPF can create a new network device (interface) through an Rtnetlink socket.
    int __rtnl_link_register(struct rtnl_link_ops *ops)
    {
        if (rtnl_link_ops_get(ops->kind))
            return -EEXIST;
    
        /* The check for setup is here because if ops
         * does not have that filled up, it is not possible
         * to use the ops for creating device. So do not
         * fill up dellink as well. That disables rtnl_dellink.
         */
        if (ops->setup && !ops->dellink)
            ops->dellink = unregister_netdevice_queue;
    
        list_add_tail(&ops->list, &link_ops);
        return 0;
    }
    EXPORT_SYMBOL_GPL(__rtnl_link_register);
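
The "look up by kind, then append" logic can be mimicked in user space with a plain singly linked list. This is only a sketch: the sketch_* names are made up, the list replaces the kernel's list_head, and locking as well as the dellink default are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct sketch_link_ops {
    const char *kind;
    struct sketch_link_ops *next;
};

/* Stands in for the kernel's global link_ops list. */
static struct sketch_link_ops *link_ops_head;

/* Return the registered ops with this kind, or NULL if there is none. */
struct sketch_link_ops *sketch_link_ops_get(const char *kind)
{
    struct sketch_link_ops *ops;

    for (ops = link_ops_head; ops; ops = ops->next)
        if (strcmp(ops->kind, kind) == 0)
            return ops;
    return NULL;
}

/* Refuse a duplicate kind, otherwise append the ops at the tail. */
int sketch_link_register(struct sketch_link_ops *ops)
{
    struct sketch_link_ops **tail = &link_ops_head;

    if (sketch_link_ops_get(ops->kind))
        return -EEXIST;

    while (*tail)
        tail = &(*tail)->next;
    ops->next = NULL;
    *tail = ops;
    return 0;
}
```

Registering a second ops structure whose kind is also "gtp5g" returns -EEXIST, which matches the first check in the kernel function above.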
    

gtp5g_genl_family

Gtp5g defines a genl (Generic Netlink) interface to enable communication between user space and kernel space once its family is registered. As mentioned earlier, the Generic Netlink controller is responsible for allocating channels on the bus, and it dynamically assigns a channel according to the genl family (identified by its ID and name).

// src/genl/genl.c
struct genl_family gtp5g_genl_family __ro_after_init = {
    .name       = "gtp5g",
    .version    = 0,
    .hdrsize    = 0,
    .maxattr    = GTP5G_ATTR_MAX,
    .netnsok    = true,
    .module     = THIS_MODULE,
    .ops        = gtp5g_genl_ops,
    .n_ops      = ARRAY_SIZE(gtp5g_genl_ops),
    .mcgrps     = gtp5g_genl_mcgrps,
    .n_mcgrps   = ARRAY_SIZE(gtp5g_genl_mcgrps),
};

The definition is in /include/net/genetlink.h:

  • .name: name of family (exclusive)
  • .version: protocol version (usually 1)
  • .hdrsize: length of user specific header in bytes
  • .maxattr: maximum number of attributes supported
  • .netnsok: set to true if the family can handle network namespaces and should be presented in all of them
  • .ops: the operations supported by this family
  • .n_ops: number of operations supported by this family
  • .mcgrps: multicast groups used by this family
  • .n_mcgrps: number of multicast groups

struct genl_family {
    int         id;     /* private */
    unsigned int        hdrsize;
    char            name[GENL_NAMSIZ];
    unsigned int        version;
    unsigned int        maxattr;
    bool            netnsok;
    bool            parallel_ops;
    const struct nla_policy *policy;
    int         (*pre_doit)(const struct genl_ops *ops,
                        struct sk_buff *skb,
                        struct genl_info *info);
    void            (*post_doit)(const struct genl_ops *ops,
                         struct sk_buff *skb,
                         struct genl_info *info);
    struct nlattr **    attrbuf;    /* private */
    const struct genl_ops * ops;
    const struct genl_multicast_group *mcgrps;
    unsigned int        n_ops;
    unsigned int        n_mcgrps;
    unsigned int        mcgrp_offset;   /* private */
    struct module       *module;
};

Gtp5g defines gtp5g_genl_ops; its operations map one-to-one to the Driver functions on the UPF side:

// src/genl/genl.c
static const struct genl_ops gtp5g_genl_ops[] = {
    {
        .cmd = GTP5G_CMD_ADD_PDR,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_add_pdr,
        // .policy = gtp5g_genl_pdr_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_DEL_PDR,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_del_pdr,
        // .policy = gtp5g_genl_pdr_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_GET_PDR,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_get_pdr,
        .dumpit = gtp5g_genl_dump_pdr,
        // .policy = gtp5g_genl_pdr_policy,
        .flags = GENL_ADMIN_PERM,
    },
    ...
}
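
Conceptually, an ops table like this is a dispatch table: for an incoming message, the matching cmd is looked up and its doit handler is called. The user-space sketch below imitates that idea with made-up command numbers and placeholder handlers that only track whether a rule ID is installed; none of the sketch_* names come from gtp5g.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command numbers; the real ones are the GTP5G_CMD_* values. */
enum sketch_cmd { SKETCH_CMD_ADD_PDR = 1, SKETCH_CMD_DEL_PDR, SKETCH_CMD_GET_PDR };

#define SKETCH_MAX_PDR 8
static unsigned char pdr_installed[SKETCH_MAX_PDR];   /* 1 if the rule ID exists */

int sketch_add_pdr(unsigned int id)
{
    if (id >= SKETCH_MAX_PDR)
        return -EINVAL;
    if (pdr_installed[id])
        return -EEXIST;
    pdr_installed[id] = 1;
    return 0;
}

int sketch_del_pdr(unsigned int id)
{
    if (id >= SKETCH_MAX_PDR || !pdr_installed[id])
        return -ENOENT;
    pdr_installed[id] = 0;
    return 0;
}

int sketch_get_pdr(unsigned int id)
{
    return (id < SKETCH_MAX_PDR && pdr_installed[id]) ? 0 : -ENOENT;
}

struct sketch_op {
    int cmd;
    int (*doit)(unsigned int id);
};

const struct sketch_op sketch_ops[] = {
    { SKETCH_CMD_ADD_PDR, sketch_add_pdr },
    { SKETCH_CMD_DEL_PDR, sketch_del_pdr },
    { SKETCH_CMD_GET_PDR, sketch_get_pdr },
};

/* Find the entry for cmd and run its handler. */
int sketch_dispatch(int cmd, unsigned int id)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_ops) / sizeof(sketch_ops[0]); i++)
        if (sketch_ops[i].cmd == cmd && sketch_ops[i].doit)
            return sketch_ops[i].doit(id);
    return -EOPNOTSUPP;
}
```

In the real module the handlers receive a socket buffer and a genl_info structure instead of a bare ID, and the lookup is done by the Generic Netlink core rather than by the module itself.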

genl_register_family()

  • The definition is in /net/netlink/genetlink.c.
  • It registers the specified family after validating it. Only one family may be registered with a given family name or identifier. The family's ops, multicast groups, and module pointer must already be assigned. It returns 0 on success or a negative error code.
  • Three functions called by it are especially important (all of them are in /net/netlink/genetlink.c):
    • genl_validate_ops()
    • genl_family_find_byname()
    • genl_validate_assign_mc_groups()

int genl_register_family(struct genl_family *family)
{
    int err, i;
    int start = GENL_START_ALLOC, end = GENL_MAX_ID;

    err = genl_validate_ops(family);
    if (err)
        return err;

    genl_lock_all();

    if (genl_family_find_byname(family->name)) {
        err = -EEXIST;
        goto errout_locked;
    }

    if (family == &genl_ctrl) {
        start = end = GENL_ID_CTRL;
    } else if (strcmp(family->name, "pmcraid") == 0) {
        start = end = GENL_ID_PMCRAID;
    } else if (strcmp(family->name, "VFS_DQUOT") == 0) {
        start = end = GENL_ID_VFS_DQUOT;
    }

    if (family->maxattr && !family->parallel_ops) {
        family->attrbuf = kmalloc_array(family->maxattr + 1,
                        sizeof(struct nlattr *),
                        GFP_KERNEL);
        if (family->attrbuf == NULL) {
            err = -ENOMEM;
            goto errout_locked;
        }
    } else
        family->attrbuf = NULL;

    family->id = idr_alloc_cyclic(&genl_fam_idr, family,
                      start, end + 1, GFP_KERNEL);
    if (family->id < 0) {
        err = family->id;
        goto errout_free;
    }

    err = genl_validate_assign_mc_groups(family);
    if (err)
        goto errout_remove;

    genl_unlock_all();

    /* send all events */
    genl_ctrl_event(CTRL_CMD_NEWFAMILY, family, NULL, 0);
    for (i = 0; i < family->n_mcgrps; i++)
        genl_ctrl_event(CTRL_CMD_NEWMCAST_GRP, family,
                &family->mcgrps[i], family->mcgrp_offset + i);

    return 0;

errout_remove:
    idr_remove(&genl_fam_idr, family->id);
errout_free:
    kfree(family->attrbuf);
errout_locked:
    genl_unlock_all();
    return err;
}
EXPORT_SYMBOL(genl_register_family);
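
The ID range handed to idr_alloc_cyclic() decides whether a family gets a fixed or a dynamically allocated ID. The following user-space sketch reproduces just that decision. The SKETCH_* constants mirror the values in include/uapi/linux/genetlink.h as I understand them, and comparing against the name "nlctrl" is a simplification of the kernel's pointer comparison with genl_ctrl.

```c
#include <string.h>

#define SKETCH_MIN_TYPE      0x10                     /* NLMSG_MIN_TYPE */
#define SKETCH_ID_CTRL       SKETCH_MIN_TYPE          /* GENL_ID_CTRL */
#define SKETCH_ID_VFS_DQUOT  (SKETCH_MIN_TYPE + 1)    /* GENL_ID_VFS_DQUOT */
#define SKETCH_ID_PMCRAID    (SKETCH_MIN_TYPE + 2)    /* GENL_ID_PMCRAID */
#define SKETCH_START_ALLOC   (SKETCH_MIN_TYPE + 3)    /* GENL_START_ALLOC */
#define SKETCH_MAX_ID        1023                     /* GENL_MAX_ID */

struct sketch_id_range {
    int start;
    int end;   /* inclusive */
};

/* Pick the ID range a family may be allocated from, based on its name. */
struct sketch_id_range sketch_pick_id_range(const char *name)
{
    struct sketch_id_range r = { SKETCH_START_ALLOC, SKETCH_MAX_ID };

    if (strcmp(name, "nlctrl") == 0)
        r.start = r.end = SKETCH_ID_CTRL;
    else if (strcmp(name, "pmcraid") == 0)
        r.start = r.end = SKETCH_ID_PMCRAID;
    else if (strcmp(name, "VFS_DQUOT") == 0)
        r.start = r.end = SKETCH_ID_VFS_DQUOT;

    return r;
}
```

A family such as "gtp5g" falls into the dynamic range, so its ID is not known in advance and user space has to resolve it by name at run time.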

genl_validate_ops()

This function verifies that every operation defines at least one handler (doit or dumpit) and that no command is used by more than one operation. Taking gtp5g as an example, a command can be thought of as an action such as adding, deleting, or modifying PDR rules.

static int genl_validate_ops(const struct genl_family *family)
{
    const struct genl_ops *ops = family->ops;
    unsigned int n_ops = family->n_ops;
    int i, j;

    if (WARN_ON(n_ops && !ops))
        return -EINVAL;

    if (!n_ops)
        return 0;

    for (i = 0; i < n_ops; i++) {
        if (ops[i].dumpit == NULL && ops[i].doit == NULL)
            return -EINVAL;
        for (j = i + 1; j < n_ops; j++)
            if (ops[i].cmd == ops[j].cmd)
                return -EINVAL;
    }

    return 0;
}
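
The same two checks can be exercised in user space with a simplified ops structure. In this sketch the sketch_* names are made up, and the handler signatures are reduced to int (*)(void) because only their presence matters here.

```c
#include <errno.h>
#include <stddef.h>

struct sketch_genl_op {
    int cmd;
    int (*doit)(void);
    int (*dumpit)(void);
};

/* Placeholder handler used to fill the table. */
int sketch_noop(void)
{
    return 0;
}

/* Reject ops without any handler and ops that reuse a command number. */
int sketch_validate_ops(const struct sketch_genl_op *ops, size_t n_ops)
{
    size_t i, j;

    if (n_ops && !ops)
        return -EINVAL;

    for (i = 0; i < n_ops; i++) {
        if (ops[i].doit == NULL && ops[i].dumpit == NULL)
            return -EINVAL;
        for (j = i + 1; j < n_ops; j++)
            if (ops[i].cmd == ops[j].cmd)
                return -EINVAL;
    }
    return 0;
}
```

The nested loop compares every pair once, which is quadratic in the number of ops but perfectly adequate for the handful of commands a family typically defines.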

genl_family_find_byname()

This function walks every entry in genl_fam_idr and checks whether a family with the same name already exists.

static const struct genl_family *genl_family_find_byname(char *name)
{
    const struct genl_family *family;
    unsigned int id;

    idr_for_each_entry(&genl_fam_idr, family, id)
        if (strcmp(family->name, name) == 0)
            return family;

    return NULL;
}

Here is the macro of idr_for_each_entry():

#define idr_for_each_entry(idr, entry, id)          \
    for (id = 0; ((entry) = idr_get_next(idr, &(id))) != NULL; id += 1U)

genl_validate_assign_mc_groups()

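This function validates the multicast group names, assigns group IDs to the family, and then updates the number of available multicast groups: on the Generic Netlink socket of every network namespace when netnsok is true, otherwise only on the socket of the initial namespace: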

static int genl_validate_assign_mc_groups(struct genl_family *family)
{
    int first_id;
    int n_groups = family->n_mcgrps;
    int err = 0, i;
    bool groups_allocated = false;

    if (!n_groups)
        return 0;

    for (i = 0; i < n_groups; i++) {
        const struct genl_multicast_group *grp = &family->mcgrps[i];

        if (WARN_ON(grp->name[0] == '\0'))
            return -EINVAL;
        if (WARN_ON(memchr(grp->name, '\0', GENL_NAMSIZ) == NULL))
            return -EINVAL;
    }

    /* special-case our own group and hacks */
    if (family == &genl_ctrl) {
        first_id = GENL_ID_CTRL;
        BUG_ON(n_groups != 1);
    } else if (strcmp(family->name, "NET_DM") == 0) {
        first_id = 1;
        BUG_ON(n_groups != 1);
    } else if (family->id == GENL_ID_VFS_DQUOT) {
        first_id = GENL_ID_VFS_DQUOT;
        BUG_ON(n_groups != 1);
    } else if (family->id == GENL_ID_PMCRAID) {
        first_id = GENL_ID_PMCRAID;
        BUG_ON(n_groups != 1);
    } else {
        groups_allocated = true;
        err = genl_allocate_reserve_groups(n_groups, &first_id);
        if (err)
            return err;
    }

    family->mcgrp_offset = first_id;

    /* if still initializing, can't and don't need to to realloc bitmaps */
    if (!init_net.genl_sock)
        return 0;

    if (family->netnsok) {
        struct net *net;

        netlink_table_grab();
        rcu_read_lock();
        for_each_net_rcu(net) {
            err = __netlink_change_ngroups(net->genl_sock,
                    mc_groups_longs * BITS_PER_LONG);
            if (err) {
                /*
                 * No need to roll back, can only fail if
                 * memory allocation fails and then the
                 * number of _possible_ groups has been
                 * increased on some sockets which is ok.
                 */
                break;
            }
        }
        rcu_read_unlock();
        netlink_table_ungrab();
    } else {
        err = netlink_change_ngroups(init_net.genl_sock,
                         mc_groups_longs * BITS_PER_LONG);
    }

    if (groups_allocated && err) {
        for (i = 0; i < family->n_mcgrps; i++)
            clear_bit(family->mcgrp_offset + i, mc_groups);
    }

    return err;
}
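
The first loop of this function only checks the group names. Here is a user-space sketch of that check. The name sketch_mcgrp_name_valid is made up, and SKETCH_GENL_NAMSIZ assumes the kernel's GENL_NAMSIZ is 16.

```c
#include <string.h>

#define SKETCH_GENL_NAMSIZ 16   /* assumed value of GENL_NAMSIZ */

/* `name` must point at a fixed-size field of SKETCH_GENL_NAMSIZ bytes.
 * A valid name is non-empty and NUL-terminated inside that field. */
int sketch_mcgrp_name_valid(const char *name)
{
    if (name[0] == '\0')
        return 0;
    if (memchr(name, '\0', SKETCH_GENL_NAMSIZ) == NULL)
        return 0;
    return 1;
}
```

A name that fills all 16 bytes without a terminating NUL is rejected, which is why group and family names can be at most 15 characters long.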

About

  • Jimmy Chang
    • Graduate student majoring in 5GC Research
    • As I am a beginner in the Linux kernel, please feel free to send me an email if you find any errors.
    • mail
    • Linkin
