
                                          May 1995

         The QNX 4  Network "Raw Packet" Interface
         -----------------------------------------

In addition to transmitting QNX packets over the various
local area networks supported by QNX, there exists an 
operating system interface, or "hook", which allows a
user-written program to transmit and receive non-QNX
(we call them raw) packets over the network.

Let's look at ethernet.  An ethernet packet looks like
the following:

-------------------------------------------------------
| dst_nid | src_nid | type/length |    data    | crc  |
-------------------------------------------------------
  6 bytes   6 bytes      2 bytes    1500 bytes  4 bytes
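
The layout above can be sketched in C roughly as follows.  This
is only an illustration, not a QNX structure: the names
eth_header and eth_parse_header are made up for this example,
and the only facts taken from the diagram are the field order
and sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN  6     /* dst_nid and src_nid are 6 bytes each */
#define ETH_HDR_LEN   14    /* 6 + 6 + 2 bytes before the data      */
#define ETH_DATA_MAX  1500  /* size of the data field in the diagram */
#define ETH_CRC_LEN   4     /* trailing crc                          */

/* Hypothetical in-memory view of the header (names are invented). */
struct eth_header {
    uint8_t  dst_nid[ETH_ADDR_LEN];
    uint8_t  src_nid[ETH_ADDR_LEN];
    uint16_t type_length;   /* converted to host byte order */
};

/* Parse the first 14 bytes of a packet.
 * Returns 0 on success, -1 if the packet is too short. */
int eth_parse_header(const uint8_t *pkt, size_t len, struct eth_header *out)
{
    assert(pkt != NULL && out != NULL);
    if (len < ETH_HDR_LEN)
        return -1;
    memcpy(out->dst_nid, pkt, ETH_ADDR_LEN);
    memcpy(out->src_nid, pkt + ETH_ADDR_LEN, ETH_ADDR_LEN);
    /* type/length goes on the wire most significant byte first */
    out->type_length = (uint16_t)((pkt[12] << 8) | pkt[13]);
    return 0;
}
```

The crc is not part of the struct because a program normally
never sees it; the card checks it and strips it.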

The type/length field is an odd one.  In the old
Digital/Intel/Xerox (DIX) ethernet specification, it
was defined as a "type", which allowed drivers to figure
out what protocol a packet belonged to when they received
it.  There are many defined protocol "types", but here
are a few relevant ones:

    0x0800  IP
    0x0806  ARP
    0x6004  DEC LAT
    0x8014  SGI network games
    0x809B  Appletalk
    0x8137  Novell
    0x8138  Novell
    0x814C  SNMP
    0x8203  QNX
 
Then, the Institute of Electrical and Electronics Engineers
jumped in with their specification of ethernet, IEEE 802.3.

They changed some electrical stuff, which everyone pretty
much went along with.  But they also slightly changed the
packet layout, replacing the two-byte "type" with a two-byte
"length" of the following "data".  Few people actually use
the intended IEEE 802.3 packet layout.

Fortunately, a valid "length" can never exceed 1500 (0x5DC),
which is less than the smallest "type" listed above, 0x800
or 2048.  So when someone receives a packet, they can always
distinguish between an old DIX v2 packet with a "type" and an
IEEE 802.3 packet with a "length".
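
As a rough sketch, the test could look like this in C.  The
function and constant names are invented for this example.  The
0x800 lower bound is the one used in the text; values that fall
between the two ranges are reported as undefined rather than
guessed at.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_MAX_LENGTH 1500u    /* largest data field, 0x5DC         */
#define ETH_MIN_TYPE   0x0800u  /* smallest "type" used in the text  */

enum eth_field_kind {
    ETH_FIELD_LENGTH,     /* IEEE 802.3 packet */
    ETH_FIELD_TYPE,       /* DIX v2 packet     */
    ETH_FIELD_UNDEFINED   /* in neither range  */
};

/* Decide how to interpret the two-byte type/length field. */
enum eth_field_kind eth_classify(uint16_t type_length)
{
    if (type_length <= ETH_MAX_LENGTH)
        return ETH_FIELD_LENGTH;
    if (type_length >= ETH_MIN_TYPE)
        return ETH_FIELD_TYPE;
    return ETH_FIELD_UNDEFINED;
}

/* Usage: a few values from the discussion above. */
void eth_classify_examples(void)
{
    assert(eth_classify(0x0800) == ETH_FIELD_TYPE);    /* IP  */
    assert(eth_classify(0x8203) == ETH_FIELD_TYPE);    /* QNX */
    assert(eth_classify(1500)   == ETH_FIELD_LENGTH);  /* full packet */
}
```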

We actually tried to be "standard" and use an IEEE 802.3 "length"
in our QNX ethernet packets early on, but ran into problems
because we weren't doing it the way everyone else (like TCP/IP)
was, with a "type" field.

So, we applied for and got a protocol "type" value for QNX 4:
0x8203.


Packet Reception
----------------

When a packet arrives in from the ethernet, the network
driver looks at the "type" field, and if it's 0x8203, it's
a QNX 4 packet.  If it isn't, it must be for someone else.

This allows many different applications, handling many different
protocols, to share the one ethernet network card being
serviced by the QNX 4 network driver.

For example, TCP/IP in QNX 4 uses the "raw packet" interface
to the Network Manager to transmit and receive IP and ARP
packets.

When "Socket" starts up, it sends a "register" message to 
the Network Manager, saying that it wants to receive all
packets with an ethernet type of 0x800 (IP).  Then, it sends
another message, saying that it wants to also handle packets
with a type of 0x806 (ARP).
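
To make the bookkeeping concrete, here is a minimal sketch of a
table that a program like Net might keep of registered types.
Everything in it is hypothetical: the names, the fixed table
size, the rule that each type has one owner, and the rule that
0x8203 cannot be registered because Net handles QNX packets
itself.  It says nothing about the real message format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_TYPE_IP   0x0800u
#define ETH_TYPE_ARP  0x0806u
#define ETH_TYPE_QNX  0x8203u
#define MAX_RAW_REGS  8        /* arbitrary limit for this sketch */

/* One entry per "register" message; owner stands in for a process id. */
struct raw_reg   { uint16_t type; int owner; };
struct raw_table { struct raw_reg regs[MAX_RAW_REGS]; size_t count; };

/* Record that `owner` wants all packets of `type`.
 * Returns 0, or -1 if the type is QNX's own, already taken,
 * or the table is full (all assumptions of this sketch). */
int raw_register(struct raw_table *t, uint16_t type, int owner)
{
    assert(t != NULL);
    if (type == ETH_TYPE_QNX || t->count == MAX_RAW_REGS)
        return -1;
    for (size_t i = 0; i < t->count; i++)
        if (t->regs[i].type == type)
            return -1;
    t->regs[t->count].type  = type;
    t->regs[t->count].owner = owner;
    t->count++;
    return 0;
}

/* Return the owner registered for `type`, or -1 if nobody wants it. */
int raw_lookup(const struct raw_table *t, uint16_t type)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (t->regs[i].type == type)
            return t->regs[i].owner;
    return -1;
}
```

In these terms, Socket's startup amounts to two raw_register()
calls with the same owner, one for 0x800 and one for 0x806.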

Socket is actually handling two different ethernet protocols.

In this "register" message, Socket includes the segments and
offsets of functions which Net can quickly "far call".  These
functions do the following things:

   1) allocate a buffer    (rx, step 1)
   2) buffer is now filled (rx, step 2)
   3) transmit complete    (tx)
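
One way to picture the contents of such a message is the sketch
below.  The struct layout, field names and offset width are all
invented for illustration; the only things taken from the text
are that the message names an ethernet type and carries a
segment and offset for each of the three functions.

```c
#include <assert.h>
#include <stdint.h>

/* A far pointer: a segment selector plus an offset in that segment. */
struct far_ptr { uint16_t segment; uint32_t offset; };

/* Hypothetical "register" message (not the real QNX layout). */
struct raw_register_msg {
    uint16_t       eth_type;      /* e.g. 0x0800 for IP            */
    struct far_ptr alloc_buffer;  /* 1) allocate a buffer          */
    struct far_ptr buffer_filled; /* 2) buffer is now filled       */
    struct far_ptr tx_complete;   /* 3) transmit complete          */
};

/* Build a message for three functions that all live in `code_seg`. */
struct raw_register_msg raw_register_msg_make(uint16_t eth_type,
        uint16_t code_seg, uint32_t alloc_off,
        uint32_t filled_off, uint32_t tx_off)
{
    struct raw_register_msg m;

    assert(code_seg != 0);        /* selector 0 is the null selector */
    m.eth_type              = eth_type;
    m.alloc_buffer.segment  = code_seg;
    m.alloc_buffer.offset   = alloc_off;
    m.buffer_filled.segment = code_seg;
    m.buffer_filled.offset  = filled_off;
    m.tx_complete.segment   = code_seg;
    m.tx_complete.offset    = tx_off;
    return m;
}
```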

This is pretty neat.  What Net does is create an alias of
Socket's code and data segments in his local descriptor
table (LDT).  For example, in Net's LDT, his code segment
might be x5, and his data segment might be xD.  And Socket's
code segment of x5 might be aliased into Net's LDT as x25, 
and Socket's data segment of xD might be aliased into Net's
LDT as x2D.

    Socket           Net           Net.ether1000
    ------          -------        -------------
    5 code ----+     5 code   +----  5  code
    D data --+ |     D data   | +--  D  data
             | |    15     <--+ |
             | |    1D     <----+
             | +->  25 
             +--->  2D

We can also see that the network driver, Net.ether1000,
has similarly aliased his code and data segments into
Net's LDT.  Net drivers get far called into all the time,
so we want that interface to be very fast, with low overhead.
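
The selector values in the example are easier to read once they
are pulled apart.  On the x86, a selector holds a table index in
the upper 13 bits, a table indicator bit (1 means LDT), and a
two-bit requested privilege level.  The sketch below decodes
them.  The selector_alias() helper is only an observation about
the example numbers (each process's pair of segments lands a
fixed number of LDT slots higher in Net's table); it is an
assumption, not a description of how Net actually picks slots.

```c
#include <assert.h>
#include <stdint.h>

struct selector_fields {
    unsigned index;   /* slot in the descriptor table      */
    unsigned in_ldt;  /* table indicator: 1 = LDT, 0 = GDT */
    unsigned rpl;     /* requested privilege level, 0..3   */
};

/* Split an x86 segment selector into its three fields. */
struct selector_fields selector_decode(uint16_t sel)
{
    struct selector_fields f;
    f.index  = sel >> 3;
    f.in_ldt = (sel >> 2) & 1u;
    f.rpl    = sel & 3u;
    return f;
}

/* Hypothetical: compute the alias of `sel` placed `slot_base`
 * LDT slots higher, keeping the indicator and privilege bits. */
uint16_t selector_alias(uint16_t sel, unsigned slot_base)
{
    struct selector_fields f = selector_decode(sel);

    assert(f.in_ldt);   /* the example only uses LDT selectors */
    return (uint16_t)(((f.index + slot_base) << 3) | (sel & 7u));
}
```

With these, x5 and xD decode to LDT slots 0 and 1.  The driver's
pair shows up two slots higher (x15, x1D) and Socket's pair four
slots higher (x25, x2D), matching the diagram.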

So.  When a packet arrives in from the ethernet, Net examines
the header and if the type is 0x800 (IP), Net then far calls
the function in Net's code segment of x25 to allocate a buffer
for this packet.

Socket's code then executes in Net's LDT as Net, to quickly
allocate a buffer, with a bare minimum of overhead.

After Socket returns a pointer to the buffer, Net passes that
pointer down in his far call back into the network driver, who
is the only one who actually knows exactly how to talk to the
network card.  The network driver then copies the IP packet
directly from the card to the buffer provided by Socket.

This avoids a redundant copy of the packet.  Extra copies are
Evil and are to Be Avoided (tm).

Now that the buffer has been filled with the packet, Net far
calls into Socket again (via Net's code segment of x25) to
function #2 above, telling Socket that the buffer is now filled.

Socket is then free to put that buffer on a queue of received
packets.  See, until the second function is called, the packet
buffer allocated in the first far call into Socket is kind of
in limbo - Socket can't really peek into it, because the contents
of it are indeterminate, sort of like Schroedinger's cat :)
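
The two-step receive can be modelled with a small state machine.
This is a simplified sketch in ordinary C: plain function calls
stand in for the far calls, memcpy stands in for the driver
reading the card, and every name here (raw_app, rx_buf,
net_receive and so on) is made up.  Dropping the packet when no
buffer is free is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RX_BUF_SIZE 1500
#define RX_POOL     4

enum rx_state { RX_FREE, RX_LIMBO, RX_READY };

struct rx_buf {
    enum rx_state state;
    size_t        len;
    uint8_t       data[RX_BUF_SIZE];
};

/* Stand-in for the raw app (Socket): a fixed pool of buffers. */
struct raw_app { struct rx_buf pool[RX_POOL]; };

/* Function #1: hand out a free buffer.  It stays "in limbo",
 * with indeterminate contents, until function #2 is called. */
struct rx_buf *app_alloc_buffer(struct raw_app *app, size_t len)
{
    assert(app != NULL);
    if (len > RX_BUF_SIZE)
        return NULL;
    for (size_t i = 0; i < RX_POOL; i++) {
        if (app->pool[i].state == RX_FREE) {
            app->pool[i].state = RX_LIMBO;
            app->pool[i].len   = len;
            return &app->pool[i];
        }
    }
    return NULL;
}

/* Function #2: the buffer is filled; the app may now queue it. */
void app_buffer_filled(struct rx_buf *buf)
{
    assert(buf != NULL && buf->state == RX_LIMBO);
    buf->state = RX_READY;
}

/* Stand-in for Net plus the driver: allocate, copy once, notify.
 * Returns 0, or -1 if the app had no buffer (packet dropped). */
int net_receive(struct raw_app *app, const uint8_t *card, size_t len)
{
    struct rx_buf *buf = app_alloc_buffer(app, len);  /* far call #1 */

    if (buf == NULL)
        return -1;
    memcpy(buf->data, card, len);   /* the one copy: card to buffer */
    app_buffer_filled(buf);                           /* far call #2 */
    return 0;
}
```

The RX_LIMBO state is the point of the exercise: between the two
calls the app owns the memory but must not read it.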


Packet Transmission
-------------------

Transmission is actually simpler.  The "raw app", Socket,
allocates a "queue packet" (a data structure which points
to a generic transmit request) and puts it on Net's input
queue, just as Proc or the kernel would.

When Net pulls Socket's queue packet off his input queue,
he sees that it is a "raw" transmit request, and simply
passes it down to the appropriate network driver.

The network driver copies the data directly from Socket's
buffer to the network card.  Again, no extra copies.

When the network driver completes transmitting the packet,
he gives its queue packet back to Net.  Net puts it on
a queue of Socket's and far calls Socket's function #3,
which usually just returns a proxy for Net to trigger.

Triggering the proxy wakes up Socket, and Socket can look
at his queue of transmitted packets and process them
appropriately.
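
The transmit path can be sketched the same way.  Again this is
a toy model with invented names: a counter stands in for
triggering the proxy, adding up byte counts stands in for the
driver copying data to the card, and the real queue packet
certainly carries more than a pointer and a length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queue packet: points at the data to transmit. */
struct queue_packet {
    const uint8_t       *data;
    size_t               len;
    struct queue_packet *next;
};

struct pkt_queue { struct queue_packet *head, *tail; };

void queue_put(struct pkt_queue *q, struct queue_packet *p)
{
    assert(q != NULL && p != NULL);
    p->next = NULL;
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

struct queue_packet *queue_get(struct pkt_queue *q)
{
    assert(q != NULL);
    struct queue_packet *p = q->head;
    if (p != NULL) {
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    return p;
}

/* Stand-in for the raw app (Socket). */
struct raw_app {
    struct pkt_queue done;   /* transmitted packets handed back */
    int proxy;               /* proxy id returned by function #3 */
    int proxy_triggers;      /* how many times Net "triggered" it */
};

/* Function #3: just return the proxy for Net to trigger. */
int app_tx_complete(const struct raw_app *app) { return app->proxy; }

/* Stand-in for Net and the driver handling one raw request.
 * Returns 0, or -1 if Net's input queue was empty. */
int net_transmit_one(struct pkt_queue *net_input, struct raw_app *app,
                     size_t *bytes_sent)
{
    struct queue_packet *p = queue_get(net_input);

    if (p == NULL)
        return -1;
    *bytes_sent += p->len;       /* driver: app buffer to card  */
    queue_put(&app->done, p);    /* packet goes back to the app */
    if (app_tx_complete(app) == app->proxy)
        app->proxy_triggers++;   /* stand-in for the trigger    */
    return 0;
}
```

Note that the queue packet itself travels the whole loop: from
Socket to Net's input queue, down to the driver, and back onto a
queue of Socket's.  Only pointers move; the data is copied once.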


