Philips Semiconductors
I2C Specific information

Programming the I2C Interface

EMBEDDED SYSTEMS
When intelligent devices need to communicate

Mitchell Kahn

Dr. Dobb's Journal, June 1992

Mitch is a senior strategic development engineer for Intel and can be contacted at 5000 W. Chandler Blvd., Chandler, AZ 85226 or at mkahn@sedona.intel.com.

The Inter-Integrated Circuit Bus ("I2C Bus" for short) is a two-wire, synchronous, serial interface designed primarily for communication between intelligent IC devices. The I2C bus offers several advantages over "traditional" serial interfaces such as Microwire and RS-232. Among the advanced features of I2C are multimaster operation, automatic baud-rate adjustment, and "plug-and-play" network extensions.

Mention the I2C bus to a group of American engineers and you'll likely get hit with an abundance of blank stares. I say American engineers because until recently the I2C bus was primarily a European phenomenon. Within the last year, however, interest in I2C in the United States has risen dramatically. Embedded systems designers are realizing the cost, space, and power savings afforded by robust serial interchip protocols.

The idea of serial interconnect between integrated circuits is not new. Many semiconductor vendors offer devices designed to "talk" via serial links with other processors. Current examples include Microwire (National Semiconductor), SPI (Motorola), and most recently Echelon's Neuron chips. In all cases, the goal is the same: to reduce the wiring and pin count necessary for a parallel data bus. It simply does not make economic sense to route a full-speed parallel bus to a slow peripheral.

Unfortunately for most serial-bus-capable devices, the choice of a bus protocol will dictate the CPU architecture. For example, only two CPU architectures implement an on-chip I2C port. If your choice of architecture precludes use of these architectures, then your only option is to implement the protocol in software.

The software implementation of the I2C protocol discussed in this article came about as a result of an implicit challenge during a staff meeting. One of our managers proposed that we hire a consultant to write a software I2C driver for the Intel 80C186EB embedded processor. Being somewhat new to the group, I took exception (although not verbally!) to his suggestion. A weekend of intense hacking later, I presented the first prototype of the driver. My reward? I got to write a generic version of the driver for general distribution.

Design Trade-offs

Three distinct tasks are involved in implementing the I2C protocol: watching the bus, waiting for a specific amount of time, and driving the bus. This became apparent when I flowcharted 1 byte of a typical bus transaction; see Figure 1. The time delays associated with creating the bus waveforms would normally have been relegated to the 80C186EB's on-chip timers. I could not, however, assume that the end users of my code would be able to spare a timer for the software I2C port. I had to forego the elegance (and to some extent accuracy) of the on-chip timers for the sledgehammer approach of software timing loops. Luckily, the I2C protocol is extremely forgiving with regard to timing accuracy.

The decision to use assembly instead of a high-level language stemmed directly from the need to control program-execution time. I had neither the time nor the inclination to hand-tune high-level code. Having made the decision to use assembly language, I faced my next problem: Could I make the code portable? Intel offers a plethora of CPU and embedded-controller architectures. Would it be possible to make the code somewhat portable between disparate assembly languages? I found my answer in the use of macros.

All the basic building blocks of the I2C protocol (watching, waiting, and doing) can be compartmentalized into distinct macros.
The algorithms that make up the I2C driver are written with these macros as the framework. You don't need to understand the intricacies of the I2C protocol to port these routines; you just need to know how to make your CPU watch, wait, and do. For example, a 4.7-µS delay is a common event during a transfer. The macro %Wait_4_7_uS implements just such a delay by using the 8086 LOOP instruction with a couple of NOPs for tuning; see Example 1(a). Total execution time is readily calculated from instruction timing tables. The same macro is ported to the i960 architecture in Example 1(b). Although I am a neophyte when it comes to i960 programming, I had no problems porting the core macros.

[Figure 1: Flowchart of 1 byte of a typical bus transaction. Legible steps: drive SCL low; assert Nth data bit on SDA; wait 2.35 µS.]

Hardware Dependencies

A few words about the target hardware are in order before I discuss the code. Any implementation of the I2C protocol requires two open-drain (or open-collector), bidirectional port pins for the Serial Clock (SCL) and Serial Data (SDA) lines. The code in this article was designed for the 80C186EB embedded processor, which has two open-drain ports on-chip. The two pins, P2.6 (SCL) and P2.7 (SDA), are part of a larger 8-bit port. Processors without open-drain I/O ports can easily implement I2C with the addition of an external open-collector latch.

Two special-function registers, P2PIN and P2LTCH, are used to read and write the state of the port pins. The 80C186EB allows the special-function registers to be located anywhere in either memory or I/O space. For this implementation, I chose to leave the registers in I/O space, even though this limited my choice of instructions. The 80186 architecture does not provide for read-modify-write instructions in I/O space (an AND to I/O, for example); it can only load and store (IN and OUT). So why did I limit myself? Again, I had to assume the lowest common denominator for our customers when designing my code.

Building the Framework

Early on in development,
I decided to partition my code macros according to physical processes involved in the I2C protocol. Code not directly involved in imitating the actions of a hardware I2C port was not written as macros. For example, the code necessary to access the stack frame is not written as a macro, whereas the code needed to toggle the clock line is. This was done to isolate architecture-dependent code sequences from the more generic I2C functions. Macros were also not used for "gray areas" such as the shifting of serial data, which is both architecture dependent and physical in nature. The I2C functions that passed the litmus test fell into the three aforementioned categories of watching, waiting, and doing.

The "waiting" macros provide a fixed minimum time delay. They are implemented using a simple LOOP $ delay. The LOOP instruction decrements the CX register, then branches to the target (in this case itself) if the result is nonzero. The delay is (n-1)*15+5 clocks, where n is the starting value in the CX register. All the delays were calculated assuming a 16-MHz clock rate (62.5 nanoseconds per clock). The code still works at lower CPU speeds because the I2C protocol only specifies minimum timings. In fact, the delay macros are only "accurate enough," providing timings as close as I could get to the specified minimum without undue tuning.

(a)
    %*DEFINE(Wait_4_7_uS)(
            mov     cx, 5       ; 4 clocks
            loop    $           ; 4*15+5 = 65 clocks
            nop                 ; 3 clocks
            nop                 ; 3 clocks
                                ; total = 75 clocks
                                ; 75 * 62.5ns = 4.69uS (close enough)
    )

(b)
    define(Wait_4_7_uS,'
            lda     0x19, r4    # instruction may be issued in parallel
                                # so assume no clocks.
    0b:     cmpdeco 0, r4, r4   # compare and decrement counter in r4
            bne.t   0b          # if != 0 branch back (predict taken branch)
                                # The cmpdeco and bne.t together take 3
                                # clocks in parallel minimum.
                                # 0x19 (25 decimal) * 3 = 75 clocks
                                # at 16MHz this is 4.69uS
    ')

Example 1: (a) 80C186 implementation of 4.7-µS wait macro; (b) 80960CA implementation of 4.7-µS wait macro.

The "watching" macros are "spin-on-bit" polling loops. These pieces of code wait for a transition on the appropriate I2C line to occur before allowing execution to continue. There are two polling macros for each of the two I2C signal lines: one for high-to-low transitions and one for low-to-high transitions. It is the polling of the SCL line that gives rise to an important feature of I2C: automatic bit-by-bit baud-rate adjustment. Any device on the I2C bus may hold the clock line low in order to stall the bus for more time (a serial wait state). The other devices on the bus are then forced to poll the SCL line until the slow device releases control of the clock.

The %Get_SDA_Bit macro also falls under the category of "watching." Its function is simply to return the state of the SDA line without waiting for a transition. %Get_SDA_Bit is used primarily to pull the serial data off the bus when the clock is valid.

The "doing" macros control the state of the clock and data lines. As with the polling macros, there are four types, one for each transition of the SCL or SDA lines. The "doing" macros are named to reflect the physical operation they perform. For example, %Drive_SCL_Low always drives the SCL line to a low state. %Release_SCL_High, on the other hand, relinquishes control of the SCL line, which may then be pulled high or driven low by another device on the bus. A read-modify-write operation is used for the bit manipulation so that the other 6 bits of Port 2 are not affected by the I2C operations.

Getting on the Bus

[Figure 2: Flowchart for I2C transmit procedure.]

Three procedures were created using the macro framework. I'll describe only the master transmit (Listing One, page 106) and master receive functions (Listing Two, page 108), as they represent the needs of most I2C users. The slave procedure is long and intricate and will not be described here.
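The three macro categories described above can be sketched in portable C. This is only an illustration under stated assumptions, not the article's assembly: the port latch and the "external" pull-down state are simulated variables, and every function name here (`drive_low`, `release_high`, `get_bit`, `wait_high`, `wait_macro_clocks`, `clocks_to_ns_at_16mhz`) is hypothetical. The clock figures are the ones quoted in Example 1(a).

```c
#include <assert.h>
#include <stdint.h>

#define SCL_MASK 0x40u   /* P2.6, as in the article */
#define SDA_MASK 0x80u   /* P2.7, as in the article */

/* Simulated stand-in for the port latch (assumption: 1 = pin released). */
static uint8_t p2_latch = 0xFF;
/* Simulated pull-downs from other devices on the bus (0 bit = pulled low). */
static uint8_t p2_external = 0xFF;

/* An open-drain pin reads high only if nobody pulls it low. */
static uint8_t read_p2_pins(void) { return (uint8_t)(p2_latch & p2_external); }

/* "Doing": read-modify-write, so the other six port bits are untouched. */
void drive_low(uint8_t mask)    { p2_latch = (uint8_t)(p2_latch & ~mask); }
void release_high(uint8_t mask) { p2_latch = (uint8_t)(p2_latch | mask); }

/* "Watching": sample a line without waiting for a transition. */
int get_bit(uint8_t mask) { return (read_p2_pins() & mask) != 0; }

/* "Watching": poll until the line reads high. The poll limit is an addition
   of this sketch so that it cannot hang; returns 1 once the line is high. */
int wait_high(uint8_t mask, unsigned max_polls)
{
    while (max_polls--)
        if (get_bit(mask))
            return 1;
    return 0;
}

/* "Waiting": clocks used by MOV CX,n / LOOP $ / NOP / NOP in Example 1(a). */
unsigned wait_macro_clocks(unsigned n)
{
    assert(n >= 1);
    return 4u + ((n - 1u) * 15u + 5u) + 3u + 3u;
}

/* 62.5 ns per clock at 16 MHz, truncated to whole nanoseconds. */
unsigned long clocks_to_ns_at_16mhz(unsigned clocks)
{
    return clocks * 125ul / 2ul;
}
```

With n = 5 the delay comes to 75 clocks, or about 4.69 µS at 16 MHz, matching the comments in Example 1(a). Setting a bit of `p2_external` to 0 while the master has released SCL mimics a slow device inserting a serial wait state.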
An I2C master transmission proceeds as follows:

1. The master polls the bus to see if it is in use.
2. The master generates a start condition on the bus.
3. The master broadcasts the slave address and expects an acknowledge (ACK) from the addressed slave.
4. The master transmits 0 or more bytes of data, expecting an ACK following each byte.
5. The master generates a stop condition and releases the bus.

The stack frame for the master transmit procedure, I2C_XMT, includes a far pointer to the message for transmission, the byte count for the message, and the slave address. Far pointers and far procedure calls are used in all the procedures. No attempt was made to conform to a specific high-level language calling convention, although such a conversion would be trivial. The procedures save only the state of the modified segment registers.

The master transmit procedure performs error checking on the passed parameters before attempting to send the message. The maximum message length is set at 64 Kbytes by the segmentation of the 80186 memory space. This restriction could be removed by including code to handle segment boundaries. The transmit procedure also checks the direction bit in the slave address to ensure that a reception was not erroneously indicated. Errors are reported back to the calling procedure through the AX register. (The exact code is in Listing One.)

The first step in sending a message is getting on the I2C bus. The macro %Check_For_Bus_Free simply polls the bus to determine if any transactions are in progress. If so, the transmit procedure aborts with the appropriate error code. If the bus is free, a start condition is generated. The start condition is defined as a high-to-low transition of SDA with SCL high followed by a 4.7-µS pause. These waveforms are easily generated with the %Drive_SDA_Low and %Wait_4_7_uS macros.
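The checks and the start condition described above might look roughly like this in C. It is a sketch on simulated line levels, not a rendering of Listing One: the error codes, the function name `i2c_start_for_transmit`, and the single-sample bus-free test are all assumptions made for illustration. The only protocol facts relied on are that the direction bit is the low bit of the address byte (1 means read) and that a start is SDA falling while SCL is high.

```c
#include <assert.h>

/* Hypothetical error codes; the article only says errors come back in AX. */
enum { I2C_OK = 0, I2C_ERR_DIRECTION, I2C_ERR_BUS_BUSY };

/* Simulated line levels: 1 = high, 0 = low. */
static int sda = 1, scl = 1;

/* Stand-in for the 4.7-uS software timing loop; it just accumulates time. */
static unsigned long waited_ns = 0;
static void wait_4_7_us(void) { waited_ns += 4700; }

int i2c_start_for_transmit(unsigned char addr_byte)
{
    /* A set direction bit would request a read, which is wrong for transmit. */
    if (addr_byte & 0x01)
        return I2C_ERR_DIRECTION;

    /* Simplified bus-free test: both lines must currently be high. */
    if (!sda || !scl)
        return I2C_ERR_BUS_BUSY;

    /* Start condition: SDA goes high-to-low while SCL stays high. */
    sda = 0;
    assert(scl == 1);
    wait_4_7_us();
    return I2C_OK;
}
```

A second call made while the first transfer still holds SDA low returns the busy code, which mirrors the abort path described in the text.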
All communication on the I2C bus between the start and stop conditions, including addressing and data, takes place as an 8-bit data value followed by an acknowledge bit. This led to the natural nested loop structure for the body of the procedure; see Figure 2.

The inner loop is responsible for transmitting the 8 bits of each data byte. Each transmitted bit generates the appropriate data (SDA) and clock (SCL) waveforms while checking for both serial wait states and potential bus collisions. A bus collision occurs when two masters attempt to gain control of the bus simultaneously. The I2C protocol handles collisions with the simple rule: "He who transmits the first 0 on the SDA line wins the bus." To ensure that we (the master transmit procedure) own the bus, the SDA line is checked whenever transmitting a 1. If a 0 is present, then a collision has occurred (because another master is pulling the line low), and the transfer must be aborted.

Control is turned over to the outer loop after the 8 bits of data (or address) have been transmitted. The outer loop immediately checks for an acknowledge from the addressed slave. The transfer is aborted if an acknowledge is not received. At the end of the ACK bit the message length counter is decremented. Control is returned to the inner loop if more data remains; otherwise a stop condition is generated and the master transmit procedure terminates.

Registers are used for intermediate result storage throughout the body of the procedure. For example, the AH register is used to hold the current value (either address or data) being shifted onto the SDA line. This eliminates the need for local data storage within the procedure.

On the Receiving End

The steps involved in an I2C master receive transaction are almost identical to those in transmission:

1. The master polls the bus to see if it is in use.
2. The master generates a start condition on the bus.
3. The master broadcasts the slave address and expects an ACK from the addressed slave.
4. The master receives 0 or more bytes of data and sends an ACK after each byte. The master signals the last byte by not sending an ACK.
5. The master generates a stop condition and releases the bus.

A far pointer to the receive buffer is passed on the stack to the master receive procedure. The remainder of the parameters (slave address and message count) are identical between the two procedures. The received message length is fixed at 64 Kbytes, again because of segmentation. The error-checking, bus-availability sensing, and start-condition generation sections of the receive procedure are lifted verbatim from the transmit code.

The structure of the receive procedure differs slightly once the start condition has been generated; see Figure 3. The slave address is transmitted using one iteration of the transmit procedure's outer loop. Control is passed to the receive loop once the slave acknowledges its address. The receive loop structure is patterned after that of the transmit procedure. The inner loop controls the clocking of the SCL line and the shifting of the serial data off the SDA line into the CPU. Eight iterations of the inner loop are performed to receive each byte. The outer loop stores the received byte in the buffer, decrements the byte count, then sends an ACK to the slave. The last data byte is signaled by not sending an ACK.

Using the Procedures

Listing Three (page 110) shows a short program that uses both the master transmit and master receive procedures. The call to procedure I2C_XMT displays the word "WS-" on a four-character, seven-segment display controlled by the SAA1064 I2C-compatible display driver.
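The transmit inner loop described earlier (eight data bits, a collision check whenever a 1 is sent, then the acknowledge bit) can be sketched in C on a simulated wired-AND bus. This is an illustrative model, not the article's assembly: `send_byte`, the result codes, and the `other_device` hook that lets a simulated slave or rival master react are all hypothetical. The sketch sends the most significant bit first, as I2C does.

```c
#include <assert.h>
#include <stddef.h>

enum { XMIT_OK = 0, XMIT_NO_ACK, XMIT_COLLISION };

/* Each device either releases a line (1) or pulls it low (0). */
static int master_sda = 1, master_scl = 1;
static int other_sda = 1, other_scl = 1;

/* Hypothetical hook: a simulated device reacts before each clock pulse. */
static void (*other_device)(int clock_index) = NULL;
static int clock_index = 0;

static int read_sda(void) { return master_sda & other_sda; }
static int read_scl(void) { return master_scl & other_scl; }

static void scl_low(void)
{
    master_scl = 0;
    if (other_device != NULL)
        other_device(clock_index);   /* data may change only while SCL is low */
}

static void scl_release(void)
{
    master_scl = 1;
    while (!read_scl())              /* serial wait state: a device holds SCL */
        ;
    clock_index++;
}

int send_byte(unsigned char value)
{
    int i;

    clock_index = 0;
    for (i = 7; i >= 0; i--) {
        int bit = (value >> i) & 1;
        scl_low();
        master_sda = bit;
        scl_release();
        if (bit == 1 && read_sda() == 0) {
            master_sda = 1;          /* lost arbitration: let go of SDA */
            return XMIT_COLLISION;
        }
    }
    scl_low();
    master_sda = 1;                  /* release SDA so the slave can ACK */
    scl_release();
    assert(clock_index == 9);        /* eight data clocks plus the ACK clock */
    return (read_sda() == 0) ? XMIT_OK : XMIT_NO_ACK;
}

/* Simulated slave: pulls SDA low only during the ninth clock. */
void acking_slave(int idx) { other_sda = (idx == 8) ? 0 : 1; }

/* Simulated rival master: transmits a 0 during the first bit. */
void rival_master(int idx) { other_sda = (idx == 0) ? 0 : 1; }
```

Installing `acking_slave` as the hook makes `send_byte` report success, leaving the hook empty yields a missing acknowledge, and `rival_master` forces the collision path when the byte being sent starts with a 1.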
The time of day is read from the PCF8583 real-time clock by the call to procedure I2C_RECV.

[Figure 3: Flowchart for I2C receive procedure.]

Please note that interrupts must be disabled during the execution of both procedures. An interruption at an inopportune time (when the master is not in control of the clock) could cause the bus to hang. If you need to service interrupts periodically, then enable them only when the clock is driven low.

These procedures have been tested on a wide array of I2C devices ranging from serial EEPROMs to voice synthesizers. No compatibility problems have been seen to date.

Enhancing the Code

I've kicked around many ideas for enhancing the I2C procedures. You could, for example, replace the timing loops with timed interrupts. That way, the CPU could perform useful work during the pauses. Along the same lines, the pauses could be scheduled using a real-time kernel, again improving CPU throughput. Finally, you could add a high-level language calling structure. The use of timed interrupts adds an order of magnitude to the complexity of the code, but would be worth it for high-performance, real-time systems.

Conclusion

I2C is not the only game in town when it comes to serial protocols. Hopefully, some of the techniques presented here will carry over into the development of other "simulated" serial protocols, such as those targeted at the home-automation market. Who knows, maybe someday a snippet of my code may find its way into a truly intelligent dishwasher. I'll be waiting.

Reprinted with permission of Dr. Dobb's Journal, 1992. Entire contents copyright 1992 by M&T Publishing, Inc. unless otherwise noted on specific articles. All rights reserved.

Philips Semiconductors, a North American Philips Company
811 E. Arques Avenue, P.O. Box 3409, Sunnyvale, California 94088-3409

Exploring I2C

Serial data buses are a well-proven tool in embedded systems. When you are communicating with slow peripheral devices, serial buses are often more convenient and less expensive than parallel buses. Additionally, a serial interface featuring a UART or similar intermediary chip can also serve to isolate the CPU from noise and line glitches that might bring down the house if they were to occur on the processor bus. Peripherals can usually be controlled over a much greater distance by a serial bus. The serial approach offers greater resilience and noise immunity.

The price you pay for the convenience is a slower transmission rate and, possibly, the need for added interface circuitry at higher voltages. Many peripheral devices, however, are not in constant communication with the CPU and are not greatly affected by a slower bus. On the hardware side, any added interface circuitry required for serial-bus support is frequently compensated for by the resulting simplicity and tighter pinout of the serial peripherals.

CHOOSING THE PROPER ROUTE

Having decided that a serial bus makes sense for your application, your next task is to select the most appropriate bus and protocol. Here, as with rapid transit, your choice should be determined by your destination. Contrary to what some people may tell you, the choice of bus and protocol depends at least as much on the nature of the system's software as it does on the manufacturer's data sheets.

Consider, for example, the serial-peripheral interface (SPI) and multidrop serial buses. Both buses are popular, but each exhibits severely constrained performance in large networks. SPI, as embodied in the Motorola 6800 family, was designed primarily for one-on-one exchanges between two devices.
Similarly, the multidrop approach used in various 8051 family members as well as in the 68HC11 and various UART chips finds its broadest expression in RS-485/422 half-duplex transmissions. Multidrop has no deterministic arbitration scheme between multiple masters, leaving it mainly suitable for single-master multiple-slave situations. (For more on multidrop, see Jack Woehr's article, "Multidrop Processing," Embedded Systems Programming, March 1990, pp. 58-67. -ed.)

A different approach is to use a three-wire protocol called MicroWire, available from National Semiconductor in Santa Clara, Calif., which is fine for use with addressable peripherals, but requires an individual chip select for each device addressed. The added wiring offers no advantage to developers, and the bus offers nothing towards achieving multiple-mastering capabilities.

One of the more versatile options available to developers is the I2C bus promulgated by Philips/Signetics in Sunnyvale, Calif. It allows you to set up a multiple-master, multiple-slave communications bus with conflict arbitration, using only twisted-pair wiring to connect the processors and peripherals. Philips/Signetics has moved to support this protocol (which is quite popular in Europe) with a large assortment of interesting doodads, and is actively encouraging other manufacturers to join in the fun.

[Figure 1: Generation of acknowledge.]

If your next design features a microprocessor that supports I2C, or you are prepared to implement I2C in software using a PIA as this article illustrates, your reward could be a decreased chip count and lower power consumption, along with a comfortable distributed-programming model for peripheral devices.

I2C is more flexible than the protocols noted above,
since only two wires are required to service a large network of addressable masters and addressable slaves. A third wire may be added if interrupt service is required, though Philips/Signetics microprocessors featuring I2C support have on-chip circuitry and are capable of interrupting the processor upon receipt of a valid address.

HOW I2C WORKS

The I2C bus consists of two lines: serial clock (SCL) and serial data (SDA). The beauty of the I2C bus is that each of these lines is bidirectional. Bidirectional means that everything on the bus is equal, unlike most other serial-peripheral busses such as SPI or MicroWire, which have dedicated inputs and outputs. Each I2C line (SCL and SDA) is both an open-collector output and an input. The pullup resistor is external.

Open-collector configuration (actually, the devices are CMOS, so "open drain" is more appropriate) means that the output stage can only pull the node to ground. A passive resistor pulls the node high, which means that any number of open-collector outputs can be connected together with no deleterious results, because it is impossible to pull more current through the resistor than any one output will produce. Tying outputs together will produce disastrous results if the same procedure is tried with standard TTL outputs. If some of the outputs go high and some are low, the current is unlimited and the logic level of the output will be in an indeterminate state. Tying open-collector outputs together is also known as "wire ORing" because if either A or B goes low, so does the single output line.

The I2C bus speed is specified at a maximum SCL rate of 100 kHz, which, admittedly, is not blazingly fast. The speed limit stems from the meager ability of a pullup resistor to source current to a long distributed line of peripherals. The 10-microsecond period allows plenty of time to charge the parasitic capacitance of the wires.
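The shared-line behavior and the clock period quoted above reduce to two small calculations. The C below is a minimal sketch with hypothetical function names (`line_level`, `period_us`); it models each output only as "pulling low" or "released" and ignores all electrical detail.

```c
#include <assert.h>

/* Level of a line shared by n open-drain outputs: low if any output
   pulls it to ground, otherwise high through the pullup resistor. */
int line_level(const int pulls_low[], int n)
{
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        if (pulls_low[i])
            return 0;
    return 1;
}

/* Clock period in microseconds for a clock rate given in kHz. */
double period_us(double rate_khz)
{
    return 1000.0 / rate_khz;
}
```

With three devices attached, the line stays high only while all three have released it, and a 100-kHz clock works out to the 10-microsecond period mentioned in the text.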
(The maximum specified wire capacitance is 400 pF.)

PUTTING IT TOGETHER

Although I2C supports multiple-master operation, here we use single-master, single-slave transactions to keep the example code simple. The master, as you might imagine, is defined as the unit that initiates the data transfer and generates the SCL signal. (In a multimaster system, each master would be responsible for generating its own SCL signal.) In our example, based strongly on the design of one of our company's single-board computers, the processor doesn't directly support I2C. Instead, we've implemented the I2C bus using a couple of the pins on an 8255 peripheral I/O chip. Consequently, the bulk of the example application code is simple setup and housekeeping routines. (Steven R. Wheeler's example application listing was a bit too long to run in this issue. Interested readers may download it from library 12 of CLMFORUM on CompuServe or from the Embedded Systems Programming bulletin board service at (415) 905-2689. -ed.)

[Figure 2: Start and stop conditions (SDA and SCL waveforms).]

By definition, a slave can be any processor or peripheral that responds to the master. Slaves all have unique, 7-bit addresses that are based on the device type and the wiring of address pins on the chip. All I2C peripherals have the top nibble of an address built in. For the PCF8574 I/O-port expanders we're using as examples, the address is 0100xxx. The xxx indicates the address selected by the state of the three address pins on the peripheral.

I2C serial transactions are always eight bits of data from the transmitter followed by a ninth ACK bit from the receiver. The first step in any I2C data transfer is to send the address of the slave on the SDA line. This act might seem confusing, since we seem to be mixing 7-bit addresses with 8-bit data.
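The way a 7-bit address fits into an 8-bit transfer can be shown with a few lines of C. This is a sketch, and the names `PCF8574_BASE` and `i2c_address_byte` are hypothetical; it relies on the I2C convention that the address occupies the upper seven bits of the first byte and the eighth bit sent is the read/write flag (1 = read).

```c
#include <assert.h>

/* Fixed part of the PCF8574 address, 0100xxx with xxx = 000. */
#define PCF8574_BASE 0x20u

/* Build the first byte of a transfer from the three address-pin levels
   and the transfer direction. */
unsigned char i2c_address_byte(unsigned pins, int read)
{
    unsigned addr7;

    assert(pins <= 7u);              /* only three address pins exist */
    addr7 = PCF8574_BASE | pins;
    return (unsigned char)((addr7 << 1) | (read ? 1u : 0u));
}
```

For a part strapped to address 000, a read request gives 01000001 (0x41) and a write gives 01000000 (0x40).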
In practice, the address byte is quite easy to work with: addresses are always seven bits long, and the eighth bit is used to determine whether the operation is a read or a write. For example, upon transmitting 01000001 to the PCF8574, the slave, assuming it exists on the bus and is strapped to address 000, will respond with a low on the SDA line after the master has finished with its last (eighth) data bit. The master leaves the line high. If it doesn't find a slave with address 000, the data line will remain high and a failed communication attempt can be detected. If a slave is connected, it begins putting data on the SDA line as soon as it has detected that the eighth bit is set (which is a read request).

The SDA line is driven to the data level when the SCL line is low. Data is read when SCL is high, so SDA must not change when SCL is high. This protocol leads to a simple definition of the start of an I2C transaction: SDA goes from high to low when the clock is high. The end of a transaction is equally simple to detect: SDA goes from low to high when SCL is high. This cycle leaves SDA and SCL in the high state, which is necessary if any other open-collector I2C peripheral wants access to the bus. Figure 2 illustrates the start and stop conditions of an I2C bus transaction.

ADDITIONAL DESIGN ROUTES

As you've seen, the I2C protocol is easy to work with and relatively simple to implement, even if you're not using a processor that directly implements it. If you're not planning to use Philips/Signetics microprocessors with onboard I2C support (such as the 68070 or various members of the 8051 family), you can still use the wide variety of available peripheral chips. The number of integrated circuits using the I2C serial bus is increasing all the time.
Application-oriented integrated circuits that support I2C include a voice synthesizer, a transceiver for IR remote control, several digital tuning circuits for computer-controlled television, several audio processors, PLL frequency synthesizers, and tone generators. General-purpose integrated circuits using I2C include LCD drivers, digital-to-analog converters, SRAMs, EEPROMs, and a RAM clock/calendar.

I2C is very popular in Europe, where Philips has been aggressively marketing this flexible method of extending peripheral support to control projects, and it is currently catching fire on this side of the Atlantic. It seems reasonable to expect that, given the burden of printed-wire requirements for embedded systems based on increasingly wider chip buses, more and more designers seeking economy of means will be attracted to the economy of I2C.

Steven Sarns is the president of Vesta Technology in Wheat Ridge, Colo. He is a member of Mensa, Intertel, and the Michigan Society of Professional Engineers. Sarns is also a founding member of the Denver chapter of the Forth Interest Group.

Jack Woehr is a senior project manager at Vesta Technology Inc. in Wheat Ridge, Colo. He is a Chapter Coordinator for the Forth Interest Group and is currently a member of the X3J14 Technical Committee for ANS Forth. He can be reached by E-mail as jax@well.sf.ca.us or as VESTA on GEnie.

Reprinted with permission from EMBEDDED SYSTEMS PROGRAMMING, September 1991. Copyright 1991 MILLER FREEMAN PUBLICATIONS.

Bit-Banging Serial Ports

They say that necessity is the mother of invention, and it certainly seems to be the case in embedded systems work. No sooner do you accomplish the impossible in one project than your boss or customer asks you to do it again, only faster and cheaper this time. Even when you're working with low-cost microcontrollers,
there's still that incentive to make things cheaper through magic software. Performing miracles through software trickery is a skill that all embedded developers must cultivate.

An opportunity for me to practice such tricks came in the form of a project using the Signetics 8x751 microcontroller. The 8x751 is an 8051 derivative that has no internal serial port: no attachment of SBUF shift registers to RxD and TxD, no diversion of timers to baud rate pacing, no serial interrupts. But the chip is low-priced and offers a small footprint, and hence is desirable in many applications. Where the price or size outweighs the need for a simple serial port, one must be built out of firmware by appropriately controlling a single bit in a port. The practice is affectionately known as "bit-banging."

The approach I'll describe here has the advantages of being simple and fast. There is no transmit state machine, no special provision for start and stop bits, and it takes less than two dozen machine cycles for each bit. It has a further advantage that the data doesn't need to be specially organized for transmitting. That is, the bits that are adjacent in the transmit data stream don't need to be adjacent when they are stored in memory. This solution is for a transmitter only, but I have used a similar procedure for receiving.

My project was required to operate at 9600 baud. This rate gives a per-bit time of 104 microseconds, or 104 cycles if you're using a 12-MHz part. The application in question had plenty of other activities as well as a serial port (such as reading a serial analog-to-digital converter, performing averages, and so on), so it was imperative that the serial port handling take an absolute minimum of time.
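The timing budget quoted above follows from simple arithmetic, sketched here in C. The function names are hypothetical, and the calculation assumes the classic 8051 arrangement of 12 oscillator clocks per machine cycle, which is what makes a 12-MHz part run one machine cycle per microsecond.

```c
#include <assert.h>

/* Duration of one serial bit in microseconds. */
double bit_time_us(double baud)
{
    assert(baud > 0.0);
    return 1.0e6 / baud;
}

/* Machine cycles available per bit, assuming 12 clocks per machine cycle. */
double cycles_per_bit(double osc_hz, double baud)
{
    assert(baud > 0.0);
    return osc_hz / 12.0 / baud;
}
```

At 9600 baud both results come to a little over 104, so a routine that needs fewer than two dozen machine cycles per bit leaves most of each bit time for other work.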
Since I chose to execute in a fixed-time loop (to avoid interrupt overhead), it was also a goal that the code take a fixed amount of time regardless of the current transmit state.

THE STRUCTURE POINTER SOLUTION

Generally, the shift (or rotate) operation is the first thing that comes to mind when you're designing code to provide a serial data output; the format of the data suggests such a scheme. With this approach, however, special states and a counter are needed to provide the start and stop bits and to sequence through the set of bytes to be transmitted. The method presented here provides an array of structures (in the code or PROM space) that defines the transmit sequence bit by bit and uses a pointer to this array as the only controlling element. This means that only two bytes of scarce internal RAM are used.

The structures are referenced consecutively. Each gives the source of a bit to be transmitted and a flag to indicate whether the pointer should be increased to point to a new bit. The transmission is terminated by having a structure that refers to an "idle" bit and does not increase the pointer. Transmission is initiated by changing the pointer to point to the first structure. Start and stop bits are not distinguished from data bits. The bit update portion of the code is constant-time, and the pointer update can be easily padded if necessary to achieve this part of the goal.

Franklin's C51 compiler was used for the work described here. The 8x751 does not support external RAM, so the small model is used. (If the transmit data resided in external RAM, the algorithm could be applied, but would be expected to take a little longer to execute.)
THE DECLARATIONS

The structure that provides individual bit definitions is:

    // transmit bit-reference structure
    struct BR {
        unsigned char index;
        unsigned char mask;
        unsigned char bump;
    };

No memory is allocated by this definition; it is essentially a typedef. The actual allocation and initialization are provided by the definition (in a header file, send_seq.h, in this case) of the BitRef array:

    code struct BR BitRef[41] = { ... };

where the details will be given in a moment. The pointer is defined as:

    // pointer to BitRef structure array
    data struct BR code *BR_ptr;

In Franklin's C51, the declaration tokens are interpreted as follows. In the struct BR declaration, the token code assigns the BitRef array to program memory (which is then accessed with the movc instruction). In the *BR_ptr declaration, the token code implies that BR_ptr is exclusively a pointer to the program space, so it requires only two bytes to be completely defined. The token data causes the compiler to store the pointer value in internal RAM. (Since I was using the small model, this would have been the default storage anyway.)

The index entry in each structure allows the serial bit to be selected from an array of bytes, called transmit[4] in my case. The transmit array can, if desired, be set up to literally overlay all of the internal memory, so that the maximum "random access" can be achieved. This was not necessary in my case.

The physical port pin to be exercised is defined:

    /* transmit is on P3.3 */
    sbit TransBit = 0xB3;

THE STRUCTURE INITIALIZATION

Each bit to be transmitted is defined by an index and mask. These are initialized in the BitRef structure so that characters can be formed as desired in the output bit stream. The index is the offset within the transmit array. The initialization in my case, for a sequence of 40 bits making up four characters, was:

    code struct BR BitRef[41] = {
    //  index  mask       bump     comment
        3,     b01000000, 1,    // 0 start bit
        1,     b00000001, 1,    // D6
        1,     b00000010, 1,    // D7
        1,     b00000100, 1,    // D8
        1,     b00001000, 1,    // D9
        1,     b00010000, 1,    // D10
        1,     b00100000, 1,    // D11
        3,     b10000000, 1,    // 1 fixed
        3,     b10000000, 1,    // 1 fixed
        3,     b10000000, 1,    // 1 stop bit
        3,     b01000000, 1,    // 0 start bit
        ...
        3,     b01000000, 1,    // 0 fixed
        3,     b10000000, 1,    // 1 stop bit
        3,     b10000000, 0     // 1 idle bit
    };

(The "masks" are given in binary notation. [See "A Binary Upgrade for C," pp. 60-62. -Ed.] Because of my assembler and hardware background, this notation is natural for me in bit mask references.)

The "index" refers, as mentioned, to the element of "transmit" in which the bit resides. Some initialization code has guaranteed that the upper two bits of transmit[3] will be 10, so that they can be referred to for start and stop bits and for any fixed-value bits that happen to be in the data stream (in my case, the fixed bits are used to indicate data byte order).

The "bump" is a flag that continues the transmission. When it finally reaches 0, the serial output sequence will stop.

THE CODE

The code fragment that accomplishes the transmission is:

    (a) TransBit = (bit)( transmit[ BR_ptr->index ] & BR_ptr->mask );
    (b) if ( BR_ptr->bump ) BR_ptr++;

The program sequence for section (a) looks like this:

    BR_ptr->index         -- looks up current index, then used in
    transmit[index]       -- to get byte with desired bit, then ANDed with mask
    BR_ptr->mask          -- to get zero/nonzero value, which
    (bit)(value & value)  -- is then cast to a bit for output
    TransBit = bit        -- to port pin, the ultimate goal.

The pointer is increased in (b), depending on the value of BR_ptr->bump. As indicated earlier, this is always one except in the last structure, so the serial transmission always proceeds to the defined end.
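The mechanism in (a) and (b) can be tried out on a desktop compiler with a sketch along the following lines. It is a portable C stand-in, not the article's C51 source: the code, data, and sbit keywords are dropped, the port pin is replaced by a return value, and the four-entry table and the function names tx_init, tx_start, and tx_tick are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-in for the article's bit-reference structure. */
struct BR {
    unsigned char index; /* which byte of transmit[] holds the bit */
    unsigned char mask;  /* which bit within that byte */
    unsigned char bump;  /* nonzero: advance to the next entry afterwards */
};

unsigned char transmit[4];

/* Hypothetical 4-entry sequence (not the article's 41-entry table). */
static const struct BR BitRef[4] = {
    { 3, 0x40, 1 },  /* 0 start bit */
    { 1, 0x01, 1 },  /* data bit: bit 0 of transmit[1] */
    { 3, 0x80, 1 },  /* 1 stop bit */
    { 3, 0x80, 0 }   /* 1 idle bit; bump == 0 parks the pointer here */
};

static const struct BR *BR_ptr = &BitRef[3];

/* Force the upper two bits of transmit[3] to 10 and park on the idle entry. */
void tx_init(void)
{
    transmit[3] = (unsigned char)((transmit[3] & 0x3F) | 0x80);
    BR_ptr = &BitRef[3];
}

/* Begin a transmission by pointing at the first entry. */
void tx_start(void)
{
    BR_ptr = BitRef;
}

/* Call once per bit time; returns the level the port pin would be given. */
int tx_tick(void)
{
    int level;

    assert(BR_ptr->index < sizeof transmit);
    level = (transmit[BR_ptr->index] & BR_ptr->mask) != 0;  /* step (a) */
    if (BR_ptr->bump)                                        /* step (b) */
        BR_ptr++;
    return level;
}
```

After tx_init(), repeated calls to tx_tick() return 1 (idle). Setting transmit[1] and calling tx_start() then yields the start bit, the data bit, and the stop bit on successive calls, after which the pointer rests on the idle entry again. The only state that changes during transmission is the pointer itself, which is the point of the design.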
The statement:

    BR_ptr = &BitRef[40];

in initialization will keep the transmitter off during startup, and:

    BR_ptr = BitRef;

is used to initiate a transmission sequence.

The previous transmitting code compiles, with only a little manual assistance, to:

    ; TransBit = (bit)( transmit[ BR_ptr->index ] & BR_ptr->mask )
            MOV     DPL,BR_ptr+01H
            MOV     DPH,BR_ptr
            CLR     A
            MOVC    A,@A+DPTR
            ADD     A,#transmit
            MOV     R0,A
            MOV     A,@R0
            MOV     R7,A
            INC     DPTR
            CLR     A
            MOVC    A,@A+DPTR
            ANL     A,R7
            ADD     A,#0FFH
            MOV     TransBit,C
    ; if ( BR_ptr->bump ) BR_ptr++
            INC     DPTR
            CLR     A
            MOVC    A,@A+DPTR
            JZ      ?C0011
            MOV     A,#03H
            ADD     A,BR_ptr+01H
            MOV     BR_ptr+01H,A
            CLR     A
            ADDC    A,BR_ptr
            MOV     BR_ptr,A
    ?C0011:

The assembly language code reveals that the mechanism is pretty efficient. This method is in use in one of my clients' products and has proved effective.

BIT-BANGING WORKS

This bit-banging solution serves to provide serial transmission in an embedded system that has no hardware specifically dedicated to the function. Though alternate and more traditional solutions would have worked, the need for speed encouraged development of a code-pointer-based solution that works fast enough in this case and takes up only two internal RAM bytes for operation. I hope that this presentation will prove to be useful for you.

Mark Gardner is a consultant based in Acton, CA. He has been designing hardware and writing firmware for embedded systems for over 15 years. He has an MS in electronic engineering from the University of Illinois.

For more information, contact:
Philips Semiconductors
811 E. Arques Avenue
P.O. Box 3409
Sunnyvale, CA 94088-3409
(408) 991-352

Reprinted with permission from EMBEDDED SYSTEMS PROGRAMMING, September 1993. © 1993 MILLER FREEMAN INC.