Microsoft networking protocols such as SMB, NetBIOS, and CIFS are used everywhere. Open source developers need to understand how they work in order for Linux to become a mainstream operating system.
SMB, the Server Message Block Protocol, is the most prevalent filesharing protocol on the planet for a very simple reason — it ships with every Microsoft Windows system and, like it or not, Windows still owns the desktop. Windows is also very common as a server platform in corporate networks. Not content with those markets, Windows is now finding its way into all sorts of new places, including embedded systems, palmtops, and consumer toys. As Windows moves onto new platforms, SMB does too.
Open source operating systems like Linux have been speaking SMB for quite a while now thanks to SAMBA, the well-known open source SMB Server suite. SAMBA, like Windows server products, is primarily a back-room tool. It runs on systems that are mounted in racks or stuck onto shelves in locked server rooms where only the geeks are brave enough to go. If Linux is going to move out of the datacenter and onto the corporate desktop (not to mention homes, hand-helds, cars, etc.), then Linux developers are going to need a working knowledge of SMB — the native language of the Microsoft Network Neighborhood.
In this article, we’ll look into SMB’s history and architecture, as well as how its components work together. You’ll also find a list of open source projects that aim to make it easier to add SMB support to Linux applications.
A Little History: NetBIOS
|Figure One: NetBIOS running over TCP/IP.|
SMB was originally intended to run over a proprietary network system co-developed by IBM and a company called Sytek. In a moment of obvious inspiration, this system was dubbed “PC-Network.” It had no support for routing and could only handle a maximum of about 80 nodes. It was truly LAN-locked.
PC-Network was a broadband LAN product consisting of network cards, cables, and a small device driver known as NetBIOS (Network Basic Input/Output System). The original PC-Network hardware is long gone, having been replaced by Token Ring and then Ethernet. Unfortunately, lots and lots of software was written for use with the NetBIOS Application Programmer’s Interface (API), so, even though the PC Network hardware is no longer in use and the NetBIOS device driver is no longer needed, the NetBIOS API has remained as a living artifact.
Instead of moving away from NetBIOS and letting it die an honorable death, several vendors implemented the NetBIOS API on top of other protocols, including DECnet, IPX/SPX, SNA, and TCP/IP. NetBIOS over TCP/IP is often called NBT and has become the preferred NetBIOS transport. The workings of NBT are described in two Internet Engineering Task Force (IETF) Request for Comments (RFC) documents, RFC 1001 and RFC 1002 (known collectively as Internet Standard #19). NBT is pictured in Figure One.
|Figure Two: SMB running over TCP/IP.|
The SMB protocol was designed to run on a PC-Network LAN, using the NetBIOS API to send and receive packets. This did not change until the release of Windows 2000 (W2K), the first version of Windows to support SMB packet transport over TCP/IP without NetBIOS encapsulation. Even so, W2K includes NBT support to maintain compatibility with its predecessors. SMB over TCP/IP is shown in Figure Two.
SMB was originally developed by Intel and Microsoft in the early 1980s and has been the core of DOS and Windows filesharing ever since. Some time around 1996, as part of the buildup to W2K, Microsoft executed a Marketing Upgrade on SMB and renamed it CIFS, or Common Internet File System.
CIFS enables the sharing of directories, files, printers, and other cool computer stuff across a network. To make use of these shared resources you need to be able to find and identify them; you also need to control access so that unauthorized users can’t fiddle about where they aren’t allowed. This means there is a hefty amount of administration to be managed, so CIFS filesharing comes with an entourage. There are protocols for service announcement, naming, authentication, and authorization. These are separate but intertwined. Some are based on published standards, others are not; most have changed over the years. These days, the term “CIFS” is most often used to refer to the full suite, while “SMB” is typically used when discussing the filesharing protocol itself.
In 1997, Microsoft submitted draft CIFS specifications to the Internet Engineering Task Force (IETF). Those drafts have since expired, but there is an effort underway by the Storage Networking Industry Association (SNIA) to revive and overhaul them outside of the IETF process.
How It Works
Because of its heritage, the CIFS suite can be a bit awkward. Most of the silliness exists at the NetBIOS layer because, as we have already explained, NetBIOS is an anachronism.
NBT is an implementation of the NetBIOS API on top of TCP/IP, but what RFC 1001 & 1002 actually describe is a system for emulating NetBIOS-based PC-Network LANs over a routed IP inter-network. This is critical to understanding the workings of NBT — it is a virtual LAN system. The nodes in a CIFS filesharing network are connected to an imaginary wire by imaginary network adapters. It’s all make-believe.
There are three key parts to NBT. These are:
- the Name Service
- the Datagram Distribution Service
- the Session Service
The Name Service handles NetBIOS names (the addresses used on the emulated PC-Network LAN). The Datagram Distribution and Session Services carry data between the nodes on the virtual PC-Network LAN.
The NetBIOS Name Service
Each NetBIOS name is a communications end-point, representing an application or daemon that is waiting to hear from other applications or daemons across the virtual wire. The Name Service keeps track of which names are in use at which IP addresses, thus allowing the underlying IP network to find the nodes and transport NetBIOS messages between them. The Name Service runs on UDP port 137.
There are two kinds of names — group and unique. Group names can be shared so that datagrams can be multicast to all members of the group. In contrast, only one instance of a unique name may be registered at a time within a virtual PC-Network LAN.
The Name Service has two modes of operation — broadcast and point-to-point. In broadcast mode, names are registered, queried, and eventually released by sending UDP broadcasts to port 137. It’s sort of like calling out, “Yo! Anybody here named RUGRAT?” in a crowded room. If there is a RUGRAT in the room, you would expect an answer like, “Yeah, here I am.”
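To make the "Anybody here named RUGRAT?" shout a little more concrete, here is a minimal sketch of how such a broadcast name query could be assembled. It assumes the first-level name encoding and query layout from RFC 1001 and RFC 1002, an empty NetBIOS scope, and an arbitrary transaction ID. The function names are invented for illustration, and nothing is actually sent on the wire.

```python
import struct

def encode_netbios_name(name: str, suffix: int = 0x00) -> bytes:
    """First-level encode a NetBIOS name (RFC 1001 style, no scope)."""
    # Pad with spaces to 15 bytes, then append the one-byte suffix.
    raw = name.upper().encode("ascii")[:15].ljust(15, b" ") + bytes([suffix])
    # Each byte becomes two letters: 'A' + high nibble, 'A' + low nibble.
    encoded = bytes(b for c in raw for b in (0x41 + (c >> 4), 0x41 + (c & 0x0F)))
    # Length-prefixed label (always 32 bytes) followed by the empty root label.
    return bytes([len(encoded)]) + encoded + b"\x00"

def build_name_query(name: str, transaction_id: int = 0x1234,
                     broadcast: bool = True) -> bytes:
    """Build a Name Service query packet for the given NetBIOS name."""
    # Flags: recursion desired (0x0100), plus the broadcast bit (0x0010).
    flags = 0x0100 | (0x0010 if broadcast else 0)
    # Header: id, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    # NBT uses big-endian (network) byte order.
    header = struct.pack(">HHHHHH", transaction_id, flags, 1, 0, 0, 0)
    # Question: encoded name, QTYPE=NB (0x0020), QCLASS=IN (0x0001).
    return header + encode_netbios_name(name) + struct.pack(">HH", 0x0020, 0x0001)

packet = build_name_query("RUGRAT")
print(len(packet), packet[13:45].decode("ascii"))
```

To really shout into the room, the packet would go out through a UDP socket with broadcasting enabled, addressed to port 137 on the subnet's broadcast address.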
Point-to-point mode is used to cross IP subnet boundaries. Since IP broadcasts are typically limited to local IP subnets, a special server called the NetBIOS Name Server (NBNS) must be used to coordinate and manage the names in use on a given NBT virtual LAN. All registrations, queries, and releases are sent directly to the NBNS, which keeps the name-to-IP mappings in a database. Microsoft’s NBNS implementation is called WINS (Windows Internet Name Service). The term WINS is now commonly used instead of NBNS, but we will be pedantic and stick with the latter.
It is possible, and even common, to combine broadcast and point-to-point name management. The RFCs describe “Mixed mode,” and Microsoft later added “Hybrid mode.” These two modes differ only in the order in which the broadcast and point-to-point mechanisms are applied.
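The ordering difference is easy to show in a few lines. In the sketch below, Mixed mode tries a broadcast first and falls back to the NBNS, while Hybrid mode asks the NBNS first and falls back to broadcast. The `resolve` function, the mode strings, and the two lookup tables are hypothetical stand-ins for real network traffic.

```python
from typing import Callable, Optional

# A resolver takes a NetBIOS name and returns an IP address or None.
Resolver = Callable[[str], Optional[str]]

def resolve(name: str, mode: str, broadcast: Resolver, nbns: Resolver) -> Optional[str]:
    """Try the lookup mechanisms in the order implied by the mode."""
    order = {
        "broadcast": [broadcast],
        "point-to-point": [nbns],
        "mixed": [broadcast, nbns],    # broadcast first, then the NBNS
        "hybrid": [nbns, broadcast],   # NBNS first, then broadcast
    }[mode]
    for lookup in order:
        address = lookup(name)
        if address is not None:
            return address
    return None

# Stand-in tables (made-up data): what a local broadcast would find,
# and what the NBNS database would hold.
local_subnet = {"RUGRAT": "192.168.1.10"}
nbns_database = {"RUGRAT": "192.168.1.10", "ETHEL": "10.0.5.7"}

print(resolve("ETHEL", "mixed", local_subnet.get, nbns_database.get))
print(resolve("ETHEL", "broadcast", local_subnet.get, nbns_database.get))
```

A name on a remote subnet is found in either combined mode; the modes differ only in which mechanism gets asked first.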
The following is a list of acronyms relevant to the Microsoft Networking Protocols:
|CIFS||Common Internet File System|
|DMB||Domain Master Browser|
|DNS||Domain Name System|
|IPX/SPX||Internetwork Packet Exchange/Sequenced Packet Exchange|
|LMB||Local Master Browser|
|NBDD||NetBIOS Datagram Distribution Server|
|NBNS||NetBIOS Name Server|
|NBT||NetBIOS over TCP/IP|
|NetBIOS||Network Basic Input/Output System|
|SMB||Server Message Block Protocol|
|SNA||Systems Network Architecture|
|TCP/IP||Transmission Control Protocol/Internet Protocol|
|UDP||User Datagram Protocol|
|WINS||Windows Internet Name Service|
The NetBIOS Datagram and Session Services
Data transport is handled either by the Datagram Distribution Service or the Session Service, depending upon the needs and design of the application. In the IP world, TCP provides connection-oriented sessions in which packets are acknowledged, put in order, and retransmitted if lost. This creates the illusion of a continuous, sequential data stream from one end to the other. In contrast, UDP datagrams are simply sent. UDP requires less overhead, but it is also considerably less reliable than TCP.
NetBIOS also provides connection-oriented (session) and connectionless (datagram) communications. Naturally, NBT uses TCP to carry NetBIOS sessions and UDP to carry NetBIOS datagrams. These services run on 139/TCP and 138/UDP, respectively.
The Datagram Service
Sending a datagram to a unique name is fairly simple. The name is resolved to an IP address via the Name Service, and the NetBIOS message is tucked into a UDP packet that is sent to port 138. That’s it.
Sending a multicast (group name) datagram is also fairly simple if broadcast mode name management is in use. In this case, group datagrams can be sent to the IP broadcast address instead of a unicast address. All local nodes will see the packet, but only group members will actually open it. It’s not too tough.
If the NBT virtual LAN crosses IP subnet boundaries, however, sending NetBIOS datagrams to a group name gets a bit icky. Per the RFCs, the same system that is running the NBNS also runs a service called the NetBIOS Datagram Distribution Server (NBDD); multicast datagrams are sent to the NBDD, which gets the list of IPs associated with the group name from the NBNS; the NBDD then forwards the datagram individually to each group member. It’s sort of like sending a group e-mail to a mailing list server. You send one message, and the server takes care of distributing copies to all of the recipients.
The problem with the datagram service is that Microsoft messed it up. They made a mistake when they implemented WINS. With the exception of one special case, WINS fails to keep track of IPs associated with a group name. Instead, WINS stores only the generic broadcast address 255.255.255.255. Because of this, Microsoft never bothered to implement the NBDD. The upshot is that some group members will not receive group multicasts, which has implications for services that rely on group names. We will see an example of this later on when we examine the Browser Service.
The sad truth is that SAMBA, in an effort to remain compatible, followed Microsoft’s example.
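The gap between the RFC design and the WINS behavior can be shown with a toy model. This is only a sketch: the two dictionaries stand in for name server databases, the addresses are made up, and `fan_out` is a hypothetical name for the forwarding step an NBDD would perform.

```python
from typing import Dict, List, Tuple

def fan_out(group: str, payload: bytes,
            name_db: Dict[str, List[str]]) -> List[Tuple[str, bytes]]:
    """Return one (ip, payload) delivery per address stored for the group."""
    return [(ip, payload) for ip in name_db.get(group, [])]

# RFC-style NBNS: the group name maps to every member, on any subnet.
rfc_nbns = {"UBIQX": ["192.168.1.10", "192.168.1.11", "10.0.5.7"]}

# WINS-style database: only the generic broadcast address is stored.
wins_db = {"UBIQX": ["255.255.255.255"]}

for label, db in (("RFC NBDD", rfc_nbns), ("WINS", wins_db)):
    targets = [ip for ip, _ in fan_out("UBIQX", b"hello", db)]
    print(f"{label}: datagram goes to {targets} on port 138")
```

With the per-member list, the member at 10.0.5.7 gets its own copy. With only the broadcast address on file, the datagram never leaves the sender's subnet, so that member hears nothing.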
The NetBIOS Session Service
Under NBT, NetBIOS sessions are created on top of TCP sessions. Here’s what happens when node FRED tries to establish a NetBIOS session with node ETHEL:
FRED uses the Name Service to find the IP address of node ETHEL.
FRED establishes a TCP connection to TCP port 139 on node ETHEL.
FRED sends an NBT SESSION SERVICE REQUEST packet via the TCP connection. The request contains the NetBIOS name of the source node (FRED) and the NetBIOS name of the target node (ETHEL).
The SESSION SERVICE REQUEST can be rejected if ETHEL isn’t home (that is, the software that registered ETHEL is not actually listening or the name was never registered at all). If the request is accepted, the two systems may send NetBIOS session packets via the TCP tunnel until the connection is closed.
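FRED's SESSION SERVICE REQUEST could be put together roughly as follows. The sketch assumes the RFC 1002 session packet layout (a one-byte type, a one-byte flags field, and a two-byte big-endian length) and the common convention that the called name carries suffix 0x20 (the file server service) while the calling name carries suffix 0x00. The helper names are invented, and no connection is opened.

```python
import struct

SESSION_REQUEST = 0x81
POSITIVE_RESPONSE = 0x82
NEGATIVE_RESPONSE = 0x83

def encode_netbios_name(name: str, suffix: int) -> bytes:
    """First-level encode a NetBIOS name (RFC 1001 style, no scope)."""
    raw = name.upper().encode("ascii")[:15].ljust(15, b" ") + bytes([suffix])
    encoded = bytes(b for c in raw for b in (0x41 + (c >> 4), 0x41 + (c & 0x0F)))
    return bytes([len(encoded)]) + encoded + b"\x00"

def build_session_request(called: str, calling: str) -> bytes:
    """Build the packet FRED sends right after the TCP connect to port 139."""
    # Payload: called (target) name first, then calling (source) name.
    payload = encode_netbios_name(called, 0x20) + encode_netbios_name(calling, 0x00)
    # Header: type, flags, payload length (network byte order).
    return struct.pack(">BBH", SESSION_REQUEST, 0, len(payload)) + payload

def session_accepted(reply: bytes) -> bool:
    """True if the first byte of ETHEL's reply is a positive response."""
    return len(reply) >= 1 and reply[0] == POSITIVE_RESPONSE

request = build_session_request(called="ETHEL", calling="FRED")
print(len(request), hex(request[0]))
```

In a real client, `request` would be written to the TCP socket and the first four bytes of the reply passed to `session_accepted` before any session messages are exchanged.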
The Session Service is the simplest of the three NBT services. It does not need to worry about distributing messages to all owners of a group name since it is inherently a point-to-point service. It is, however, the transport for SMB filesharing, so it is of particular interest to us.
The Sum of the Parts
The purpose of NBT is to provide an emulated PC-Network LAN. It does not matter if the participating nodes are scattered across the Internet. If they share the same NetBIOS name space, they are on the same virtual wire. It is the Name Service that is responsible for creating and maintaining the name space, so the Name Service defines the virtual LAN.
The NetBIOS API is the gateway to that virtual LAN, but non-Windows systems generally avoid using that interface. Instead, they typically craft the NBT packets and handle TCP and UDP transfers directly. This, unfortunately, can give the impression that SMB and its associated services are all IP-based, which really isn’t the case. Remember that the NetBIOS API has been implemented on top of lots of other protocols too.
Now that we have jumped through the flaming NBT hoops, it’s time to juggle the SMB chainsaw.
The first thing to note about Server Message Block is that SMB packets use Intel little-endian byte order, while NBT uses big-endian network byte order. No matter how you fiddle with it, if you are going to implement SMB, you are going to have to swap a few bytes.
The Server Message Block is a record structure. The first field always contains the four identifying bytes 0xFF, ‘S’, ‘M’, ‘B’, just to make it absolutely clear what you are dealing with. The second field is the command. SMB messages are made up of a command, the data associated with the command, and the context in which the command is to execute. The context information allows SMB to keep track of multiple links multiplexed within a single NetBIOS session.
Most of the SMB commands are derived directly from DOS I/O functions. They include basic stuff like OPEN, CLOSE, and DELETE, plus commands for handling print jobs and a few other oddities. Before these can be used, however, a client must gain access to a shared printer or directory (a “share”). The SMB protocol has also undergone a bit of evolution since it was first introduced, and this has resulted in a number of “dialects.” To accommodate the various SMB dialects, there is a NEGOTIATE SMB that lets nodes discuss and agree upon an SMB dialect to use.
Documentation on SMB can be found on Microsoft’s FTP servers. Just dig around in ftp://ftp.microsoft.com/developr/drg/cifs/ for a while if you are curious. Start with the older stuff and work your way forward.
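Both points, the byte swapping and the record layout, can be seen in a short sketch. It assumes the 32-byte header layout documented for the later (NT LM 0.12 era) dialects; older dialects use some of these bytes differently. The field names are descriptive labels chosen here, with TID, PID, UID, and MID being the context fields that let several links share one session.

```python
import struct

# The same 16-bit value packs differently for SMB and for NBT.
little = struct.pack("<H", 0x1234)   # SMB (Intel) byte order
big = struct.pack(">H", 0x1234)      # NBT (network) byte order

# "<" means little-endian with no padding; total size is 32 bytes.
SMB_HEADER = struct.Struct("<4sBIBHH8sHHHHH")
FIELDS = ("protocol", "command", "status", "flags", "flags2", "pid_high",
          "signature", "reserved", "tid", "pid", "uid", "mid")

def build_header(command: int, tid: int = 0, pid: int = 0,
                 uid: int = 0, mid: int = 0) -> bytes:
    """Pack an SMB header with zeroed status, flags, and signature."""
    return SMB_HEADER.pack(b"\xffSMB", command, 0, 0, 0, 0,
                           b"\x00" * 8, 0, tid, pid, uid, mid)

def parse_header(data: bytes) -> dict:
    """Unpack the first 32 bytes and check the identifying characters."""
    header = dict(zip(FIELDS, SMB_HEADER.unpack(data[:SMB_HEADER.size])))
    if header["protocol"] != b"\xffSMB":
        raise ValueError("not an SMB message")
    return header

raw = build_header(command=0x72, tid=1, pid=0x1234, uid=2, mid=3)
parsed = parse_header(raw)
print(little.hex(), big.hex(), SMB_HEADER.size, hex(parsed["pid"]))
```

Note that the NBT session header wrapped around this record is still big-endian, so a single packet on the wire mixes both byte orders.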
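As a rough illustration of dialect negotiation, the sketch below builds a NEGOTIATE request. It assumes the documented wire format: command code 0x72, a word count of zero, a little-endian byte count, and each dialect string preceded by a 0x02 buffer-format byte and ended with a NUL. The three dialect names are real ones, but the list is only a sample, and the helper names are made up.

```python
import struct

SMB_COM_NEGOTIATE = 0x72
NO_COMMON_DIALECT = 0xFFFF

def build_negotiate(dialects: list) -> bytes:
    """Build a NEGOTIATE request offering the given dialects, oldest first."""
    # 32-byte SMB header with everything but the command zeroed.
    header = struct.pack("<4sBIBHH8sHHHHH", b"\xffSMB", SMB_COM_NEGOTIATE,
                         0, 0, 0, 0, b"\x00" * 8, 0, 0, 0, 0, 0)
    # Each dialect: buffer-format byte 0x02, ASCII name, NUL terminator.
    data = b"".join(b"\x02" + d.encode("ascii") + b"\x00" for d in dialects)
    # WordCount = 0 (no parameter words), then ByteCount, then the data.
    return header + struct.pack("<BH", 0, len(data)) + data

def chosen_dialect(dialects: list, index: int):
    """Map the index from the server's reply back to a dialect name."""
    return None if index == NO_COMMON_DIALECT else dialects[index]

offered = ["PC NETWORK PROGRAM 1.0", "LANMAN1.0", "NT LM 0.12"]
request = build_negotiate(offered)
print(len(request), chosen_dialect(offered, 2))
```

The server answers with an index into the offered list rather than a name, which is why the client has to remember the order in which it made its offers.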
Presentation Is Everything
|Figure Three: Servers available on the network.|
|Figure Four: File shares available on the Scred server.|
Now it’s time to put a pretty face on all of this. On any Windows desktop, you will likely find an icon labeled “Network Neighborhood.” This is the front door to the CIFS (and SMB) world. Double-click the icon and you should see something resembling Figure Three.
The icons in the window represent servers available on the network. (Your network, of course, will have different servers listed.) Double-click a server icon and you should see a list of the shares offered by that server. Figure Four contains such a list.
Sure Looks Pretty, Doesn’t it?
The underlying system that makes this presentation possible is called the “Browser Service.” This service collects and maintains the “Browse List,” and viewing the Browse List (e.g., via the Network Neighborhood) is called “Browsing.” It should be noted that Microsoft came up with these names before the invention of the Web Browser, so they cannot be blamed for any ensuing confusion.
Browsing is organized in terms of IP subnets and Workgroups. A “Workgroup” is a set of NBT nodes on an IP subnet that shares the same Workgroup name. In our examples, all of the nodes are members of the UBIQX workgroup.
On each subnet, the Workgroup members hold an “election,” which involves sending group datagrams via the NBT Datagram Service. The election mechanism makes Florida recounts look easy, so we will save the description for another day. Eventually, a winner is declared and designated as the local “Master Browser” (LMB) for the Workgroup. If there are a lot of nodes in the Workgroup, additional local Browsers may be elected to serve as “Backup Browsers.”
When a client wishes to see the Browse List, it asks one of the Browsers on the local LAN for a copy; this is what is displayed when you double-click Network Neighborhood.
As described earlier, the lack of a working NBDD in Microsoft’s implementation of NBT limited browsing to IP subnets. Microsoft recognized the need to circulate Browse Lists outside of IP subnets, so they created yet another new server called the “Domain Master Browser” (DMB). The DMB registers its name with the WINS server. All of the local Master Browsers look for this name and will send updates to the DMB, which then combines the lists and hands them back. The DMB is a work-around for the missing NBDD, essentially allowing browsing to cross subnet boundaries.
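The combine-and-hand-back step can be modeled in a few lines. This is a toy sketch of the idea only, not of the Browser protocol itself: the subnets, the server names other than those used in this article, and the `dmb_combine` function are all made up for illustration.

```python
from typing import Dict, Set

def dmb_combine(lmb_lists: Dict[str, Set[str]]) -> Set[str]:
    """Merge the Browse Lists reported by every local Master Browser."""
    combined: Set[str] = set()
    for servers in lmb_lists.values():
        combined |= servers
    return combined

# Each local Master Browser only knows the servers on its own subnet.
lmb_lists = {
    "192.168.1.0/24": {"SCRED", "RUGRAT"},
    "10.0.5.0/24": {"ETHEL", "FRED"},
}

# The DMB merges the updates, then hands the full list back to every LMB.
merged = dmb_combine(lmb_lists)
for subnet in lmb_lists:
    lmb_lists[subnet] = set(merged)

print(sorted(lmb_lists["10.0.5.0/24"]))
```

After the exchange, a client on either subnet sees all four servers when it asks its local Browser for the list.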
NBT can be a pain in the neck, and old mistakes still haunt modern implementations. What is a multi-billion dollar corporation to do?
As mentioned earlier, Microsoft introduced SMB without NBT in Windows 2000 and calls it CIFS. In CIFS, the SMB packets run native over TCP without the need for NetBIOS framing. Not only has NetBIOS been removed, but all of the supporting systems (like name resolution, browsing, and even authentication) have been replaced with standards-based services. WINS, for example, has been replaced by Dynamic DNS, and Kerberos is now used for authentication. At the core of all this is the Active Directory, which (like Novell’s NDS) is based on X.500. Unfortunately, Microsoft seems to have added their own spin to these services, and several sources are complaining about incompatibilities.
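Even without NetBIOS, a TCP stream still needs some way to mark where one SMB message ends and the next begins. The sketch below assumes the framing used by native SMB over TCP, which is not spelled out in this article: a zero byte followed by a 24-bit big-endian length, normally carried on TCP port 445. The function names are hypothetical.

```python
import struct

MAX_LENGTH = 0xFFFFFF  # largest value that fits in 24 bits

def frame(smb_message: bytes) -> bytes:
    """Prefix an SMB message with the 4-byte native transport header."""
    if len(smb_message) > MAX_LENGTH:
        raise ValueError("message too long for a 24-bit length field")
    # Packing the length as a 32-bit big-endian integer leaves the first
    # byte zero, because the length was checked to fit in 24 bits.
    return struct.pack(">I", len(smb_message)) + smb_message

def unframe(stream: bytes):
    """Split one framed message off the front of a byte stream."""
    (length,) = struct.unpack(">I", stream[:4])
    if length > MAX_LENGTH:
        raise ValueError("first byte of the header must be zero")
    return stream[4:4 + length], stream[4 + length:]

framed = frame(b"\xffSMB" + b"\x00" * 28)
message, rest = unframe(framed + b"leftover")
print(framed[:4].hex(), len(message), rest)
```

The header is deliberately close to the NBT session message header, which helps explain how one server can speak both transports with largely the same code.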
SAMBA can work with Windows 2000 systems, as long as the latter are running in NBT-compatibility mode. Even so, a number of problems have been reported and, hopefully, accommodated. SAMBA lives by adjusting itself to the quirks of each new Microsoft OS.
The Linux community has only begun to dig into Windows 2000. It is an enormous animal that will take some time to dissect. Meanwhile, Microsoft is already working on their next big products. We can only wait to see what changes will come with those.
CIFS and Linux
As we mentioned at the beginning of this article, SMB and CIFS are here to stay. For this reason, Linux must put itself on an equal playing field with Microsoft products if it is to be accepted as a mainstream operating system.
Existing Linux client tools are command-line based, designed for use by folks who already know how this stuff works. Fortunately, efforts are underway to develop CIFS client libraries and other tools aimed at making CIFS easy to use from Linux. These tools will be incorporated into popular desktop systems (such as KDE) so that Linux users can also browse CIFS shares from the desktop…just like Windows users can.
While Linux is working to catch up, Windows has already moved beyond the desktop and into palmtops, settops, and other markets. They have brought CIFS with them. Though it fits well in the back room with the big iron, SAMBA is too big for these environments. Linux needs tiny CIFS clients for the embedded market, simple servers for small-end network appliances, and graphical tools to help it on the desktop.
In the Open Source CIFS Projects sidebar you will find information about some of the Open Source projects currently aiming to fully leverage CIFS for Linux and other operating systems. Get involved and get to know what these projects offer. These are the foundation of the future success of Linux in the Network Neighborhood.
Open Source CIFS Projects
SAMBA (www.samba.org) is the best-known and most popular Open Source implementation of SMB/CIFS, but there are other projects aimed at leveraging these protocols. This is a partial listing:
SMBFS for Linux
Originated by Pal-Kristian Engstad, SMBFS allows a Linux system to mount SMB shares (shared directories). SMBFS is officially part of the Linux kernel. The project is now maintained by Urban Widmark after changing hands several times.
SAMBA Appliance Branch
SAMBA Appliance Branch is a slightly stripped-down version of SAMBA intended for Server Appliance devices (headless computers, rack-mounted in server rooms, requiring minimal management). Dave O’Neill maintains this code branch.
Caldera asked Richard Sharpe of the SAMBA Team to produce an SMB client library for Linux/Unix. When finished, the library will make it easy to add CIFS client capabilities to applications such as KDE’s Konqueror. The library is being derived from SAMBA source.
Alain Barbet has developed Perl modules that interface to SAMBA command-line utilities and to Richard Sharpe’s client library.
Yours Truly is desperately trying to document the workings of NBT, SMB, and CIFS to make it easier to implement.
jCIFS is a set of Java classes that implement SMB/CIFS protocols and is aimed at the development of client applications. Michael Allen heads the coding effort.
Luke Kenneth Casson Leighton, eager to explore the depths of Microsoft’s Remote Procedure Call (MS-RPC) system, created this SAMBA spin-off project with the help of several folks from the SAMBA-Technical mailing list.
libsmb++ is an SMB client library written in C++. Nicolas Brodu started the project, but is not able to pursue it further. This project is looking for a new owner.
Based on Linux SMBFS, Sharity-Light runs in user-space instead of as part of the kernel. Note that Sharity-Light is Open Source, but Sharity (no Light) is a commercial product. (Does that make it Sharity Dark?) Both are from a company called Objective Development.
SAMBA and SMBFS for Amiga
It’s not dead yet! Olaf Barthel has ported both SAMBA and Sharity-Light to the Amiga platform.
There are probably more. The Open Source world is a dynamic place with new stuff popping up all the time. Andrew Tridgell, originator of SAMBA, believes in competition within the community and has encouraged more than one of the above projects. Simply put, he believes the more people who know how different projects work, the better.
In a speech at the LinuxWorld Expo in New York, Tridgell said, “Let’s just get rid of these horrible protocols.” His hope — one shared by many SMB/CIFS developers — is that people will eventually understand how terrible CIFS is, and they will make it go away.
Chris Hertel is a member of the SAMBA Team and a founding member of the jCIFS Team. He can be reached at firstname.lastname@example.org.