MOS vs BIP

This post is not supposed to be very comprehensive when comparing the MOS with the BIP (bipolar) transistor. However, I thought it could be nice to have it outlined on a single sheet here. MOS stands for metal oxide semiconductor.

In the picture below we find Lilienfeld vs Shockley. (Yes, yes, yes, one can argue about who did what, etc., but I will let them represent the two transistors.) Lilienfeld represents a more square symbol and a “simpler” expression of the current as a function of input voltage (in its desired operating region): a polynomial – the square of the input voltage, ube. Shockley presents an exponential function instead – the diode equation. I have sketched the currents as functions of the input voltage and we see that even though the square (MOS) is stronger in the beginning, the exponential quickly comes up to pace and produces higher currents.

The MOS layout is more compact compared to the bipolar. The MOS offers an infinite input resistance (well). The bipolar does not. In fact it must have an input current to operate as desired.

Considering the small-signal schematics, the parameters gm (transconductance), gds/go (output conductance) and gp (input conductance) can all be derived as dependent on the current through the transistor. The higher the current, the higher everything, sort of. Arguably, this implies that the gain, being a ratio of two of them, is more or less constant with current.
At the bottom of the figure, we find the intrinsic gain of the transistors.

For the MOS it is the Early voltage over the effective input voltage, i.e., gate-source voltage minus the threshold voltage. For the bipolar it is the Early voltage over the thermal voltage (~26 mV). These two gain expressions actually tell us that it is quite likely that the gain is higher for the bipolar than the MOS! (This can also be seen from the MOS transistor operating in the subthreshold region).

Why larger? With the square-law transconductance gm = 2 ID / Veff, the MOS effective voltage would have to be pushed down to 2 x 26 mV = 52 mV to match the bipolar, which relies on the thermal voltage. That is hard to do.
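To put some numbers on the comparison, here is a minimal sketch using only the hand-calculation expressions. The Early voltage of 20 V (taken as identical for both devices) and the list of effective voltages are assumptions chosen for illustration, not values from the figure.

```python
# Hand-calculation intrinsic gains (gm / g_out) for the two devices.
# All numeric values below are illustrative assumptions.

V_T = 0.026  # thermal voltage kT/q at room temperature, in volts


def mos_intrinsic_gain(v_early, v_eff):
    """Square-law MOS: gm = 2*I/Veff, gds = I/V_A, so gain = 2*V_A/Veff."""
    return 2.0 * v_early / v_eff


def bjt_intrinsic_gain(v_early):
    """Bipolar: gm = I/V_T, go = I/V_A, so gain = V_A/V_T."""
    return v_early / V_T


V_A = 20.0  # assumed Early voltage, same for both devices, in volts

for v_eff in (0.4, 0.2, 0.1, 0.052):
    print(f"Veff = {v_eff * 1e3:5.0f} mV: "
          f"MOS gain = {mos_intrinsic_gain(V_A, v_eff):6.1f}, "
          f"BJT gain = {bjt_intrinsic_gain(V_A):6.1f}")
```

Under these assumptions the two gains only meet when the effective voltage reaches 52 mV, where the square-law expression is no longer trustworthy anyway.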

Another glance on the MOS transistor equations

A few posts ago, I created the first part of an EKV-model discussion and we had a somewhat earlier post on MOS modeling. I haven’t compiled part two yet, but thought we could quickly revisit the Shichman-Hodges model and just look at the formulae again, and get a feeling for parameters and relationships. We also revisit a unified expression for the current in “all” operating regions of the transistor.

Shichman-Hodges

For hand-calculations of MOS transistors we still, kind-of, use the Shichman-Hodges model from 1968 for the three regions (cut-off, linear, and saturation):

$I_D = 0$
$I_D = \frac{\mu_0 \cdot C_{ox} } { 2 } \cdot \frac{W}{L} \cdot \left( 2 \left( V_{gs} - V_{th} \right)\cdot V_{ds} - V_{ds}^2 \right)$
$I_D = \frac{\mu_0 \cdot C_{ox} } { 2 } \cdot \frac{W}{L} \cdot \left( V_{gs} - V_{th} \right)^2$

where the parameters are effective carrier mobility, oxide capacitance per area, effective width and length of channel, gate-source voltage, drain-source voltage, and threshold voltage. Of course there should be large approximation signs, due to many reasons as discussed before. The transistor operates in different regions dependent on the voltages across it. See table further down below.

Also, the channel length modulation, lambda, is not included in the formula either.
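As a minimal sketch, the three-region model can be written as one small function with if-statements. The values of `ALPHA` (the prefactor $\frac{\mu_0 C_{ox}}{2} \frac{W}{L}$) and `V_TH` are assumptions picked for illustration only.

```python
def shichman_hodges_id(v_gs, v_ds, v_th, alpha):
    """Hand-calculation NMOS drain current, without channel length modulation.

    alpha stands for mu0 * Cox / 2 * W / L, in A/V^2.
    """
    v_eff = v_gs - v_th
    if v_eff <= 0.0:
        return 0.0                                       # cut-off
    if v_ds < v_eff:
        return alpha * (2.0 * v_eff * v_ds - v_ds ** 2)  # linear
    return alpha * v_eff ** 2                            # saturation


ALPHA = 100e-6  # assumed prefactor, A/V^2
V_TH = 0.5      # assumed threshold voltage, V

for v_ds in (0.1, 0.3, 0.5, 1.0):
    i_d = shichman_hodges_id(1.0, v_ds, V_TH, ALPHA)
    print(f"Vgs = 1.0 V, Vds = {v_ds:.1f} V: ID = {i_d * 1e6:.1f} uA")
```

The linear and saturation expressions meet at $V_{ds} = V_{gs} - V_{th}$, so the current itself is continuous across that boundary.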

Anyhow, I’ve always thought the formulas look a bit skewed and not that unified, arguably since they originally aimed at distinct operating regions. The drive and supply voltages were considerably higher than the threshold voltage and the chance/risk of the transistors sliding over between the different regions was very small.

There are suggestions on ways to glue the three regions together and use some indices back and forth. We can for example introduce the channel length modulation for all three equations. The channel length modulation is (for hand calculations) assumed to be very small, and in the linear region, the inherent low drain-source voltage drop will further diminish its impact.

$I_D = 0 \cdot \left(1 + \lambda \cdot \left( V_{ds} - V_{ds, sat} \right) \right)$
$I_D = \frac{\mu_0 \cdot C_{ox} } { 2 } \cdot \frac{W}{L} \cdot \left( 2 \left( V_{gs} - V_{th} \right)\cdot V_{ds} - V_{ds}^2 \right) \cdot \left(1 + \lambda \cdot \left( V_{ds} - V_{ds, sat} \right) \right)$
$I_D = \frac{\mu_0 \cdot C_{ox} } { 2 } \cdot \frac{W}{L} \cdot \left( V_{gs} - V_{th} \right)^2 \cdot \left(1 + \lambda \cdot \left( V_{ds} - V_{ds, sat} \right) \right)$

where I use the drain-source saturation voltage, which Sedra omitted.

$V_{ds, sat} = V_{gs} - V_{th}$

Let us consider the linear region first and do some manipulations

$\alpha = \frac{\mu_0 \cdot C_{ox} } { 2 } \cdot \frac{W}{L}$
$M = \left(1 + \lambda \cdot \left( V_{ds} - V_{ds, sat} \right)\right)$

$I_D = \alpha \cdot \left( 2 \left( V_{gs} - V_{th} \right)\cdot V_{ds} - V_{ds}^2 \right) \cdot M =$
$= \alpha \cdot \left(\left( V_{gs} - V_{th} \right)^2 - \left( V_{gs} - V_{th} \right)^2 + 2 \left( V_{gs} - V_{th} \right)\cdot V_{ds} - V_{ds}^2 \right) \cdot M =$
$= \alpha \cdot \left(\left( V_{gs} - V_{th} \right)^2 - \left( V_{gs} - V_{th} - V_{ds} \right)^2 \right) \cdot M =$
$= \alpha \cdot \left(\left( V_{gs} - V_{th} \right)^2 - \left( V_{gd} - V_{th} \right)^2 \right) \cdot M =$
$= \alpha \cdot \left(\left(\underbrace{ V_{gs} - V_{th}}_{V_{gs, eff}} \right)^2 - \left( \underbrace{ V_{gd} - V_{th} }_{V_{gd, eff}} \right)^2 \right) \cdot M =$
$= \alpha \cdot \left( V_{gs, eff}^2 - V_{gd, eff}^2 \right) \cdot M$

And actually, we also see that

$V_{ds} - V_{ds, sat} = V_{ds} - (V_{gs} - V_{th}) = -V_{gd} + V_{th} = - V_{gd,eff}$

which compiles the equations as

$I_D = \frac{\mu_0 \cdot C_{ox} } { 2 } \cdot \frac{W}{L} \cdot 0 \cdot \left(1 - \lambda \cdot V_{gd, eff} \right)$
$I_D = \frac{\mu_0 \cdot C_{ox} } { 2 } \cdot \frac{W}{L} \cdot \left( V_{gs, eff}^2 - V_{gd, eff}^2 \right) \cdot \left(1 - \lambda \cdot V_{gd, eff} \right)$
$I_D = \frac{\mu_0 \cdot C_{ox} } { 2 } \cdot \frac{W}{L} \cdot V_{gs, eff}^2 \cdot \left(1 - \lambda \cdot V_{gd, eff} \right)$

and one could now introduce a couple of binary parameters such that all regions could be covered in one go.

$I_D = \frac{\mu_0 \cdot C_{ox} } { 2 } \cdot \frac{W}{L} \cdot r_{on} \cdot \left( V_{gs, eff}^2 - r_{lin} \cdot V_{gd, eff}^2 \right) \cdot \left(1 - \lambda \cdot V_{gd, eff} \right)$

and the table compiles the choices. Now we have only two voltages to consider, and the different regions, and the physical parameters, of course.

| Region | $r_{on}$ | $r_{lin}$ | Conditions |
|---|---|---|---|
| Cut-off | 0 | any | $V_{gs,eff} < 0$ |
| Linear | 1 | 1 | $V_{gs,eff} > 0, V_{gs,eff} > V_{ds}$ |
| Saturation | 1 | 0 | $V_{gs,eff} > 0, V_{gs,eff} < V_{ds}$ |
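A quick numerical sanity check of the unified expression is sketched below. It compares the single formula with the two binary parameters against the three separate region equations including the $\lambda$ term. The function names and the parameter values are assumptions for illustration. Note that $V_{gs,eff} > V_{ds}$ is the same condition as $V_{gd,eff} > 0$, which is what the code uses.

```python
def id_unified(v_gs, v_ds, v_th, alpha, lam):
    """Unified expression with the binary parameters r_on and r_lin."""
    v_gs_eff = v_gs - v_th
    v_gd_eff = (v_gs - v_ds) - v_th
    r_on = 1.0 if v_gs_eff > 0.0 else 0.0
    r_lin = 1.0 if v_gd_eff > 0.0 else 0.0  # irrelevant when r_on = 0
    return (alpha * r_on * (v_gs_eff ** 2 - r_lin * v_gd_eff ** 2)
            * (1.0 - lam * v_gd_eff))


def id_piecewise(v_gs, v_ds, v_th, alpha, lam):
    """The three separate region equations, with M = 1 + lam*(Vds - Vds,sat)."""
    v_eff = v_gs - v_th
    m = 1.0 + lam * (v_ds - v_eff)
    if v_eff <= 0.0:
        return 0.0                                           # cut-off
    if v_ds < v_eff:
        return alpha * (2.0 * v_eff * v_ds - v_ds ** 2) * m  # linear
    return alpha * v_eff ** 2 * m                            # saturation


# Assumed parameter values, for illustration only.
ALPHA, V_TH, LAMBDA = 100e-6, 0.5, 0.05

worst = 0.0
for v_gs in (0.2, 0.8, 1.2):
    for v_ds in (0.0, 0.1, 0.5, 1.0, 1.8):
        diff = abs(id_unified(v_gs, v_ds, V_TH, ALPHA, LAMBDA)
                   - id_piecewise(v_gs, v_ds, V_TH, ALPHA, LAMBDA))
        worst = max(worst, diff)
print(f"largest difference over the grid: {worst:.3e} A")
```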

Notably omitted

As always, the brasklapp (reservation) is that the formulas above are far from accurate today, and we have not considered, for example, bulk effects and many, many more effects. However, the target was just to give another glance at the hand calculations and see if there are ways to write them just a bit more intuitively…

How much is a 24-bit converter?


A while ago we supervised the design of a linearization module for an R/2R digital-to-analog converter (DAC). It was an interesting piece of work. In general, the resolution was very high and for example, for audio applications, if we consider a 24-bit converter – how much is that? With this post, I just wanted to inspire some thinking…

Well, if we consider a 24-bit converter, the number of levels is simply

$L = 2^{24} = 16777216$

so something like 17 megalevels or so. Is that much? Well, “yes”? They say that humans have difficulties visualizing grand scales and I am not sure my picture below will help, but hopefully.

Let us assume our reference range is Mount Everest, the world’s tallest mountain. Now, I understand that most of you have not seen Mount Everest, but still. We can then add for example Burj Khalifa to the picture, etc. (Not all of us have seen that either, but it is approximately 10x smaller.)

If we have an object of length x and assume that Mount Everest is A = 8848 meters high, we can express the impressive number of bits (INOB) as

$\text{INOB} = \log_2 \left( {2^{24} \cdot \frac{x}{A} } \right) = \log_2{2^{24}} + \log_2 x - \log_2 A = 24 - 13.1 + \log_2 x \sim 10.9 + \log_2 x$

Burj Khalifa thus has an INOB of approximately 20.6 bits (and Mount Everest would be 24 bits). The picture below tries to visualize this in a good manner.

But perhaps we need one of those tables to make life even better.

| Thing | INOB |
|---|---|
| Mount Everest | 24 |
| Burj Khalifa | 20.6 |
| Airbus 380 | 17.1 |
| J Jacob Wikner | 11.8 |
| A bottle | 9.2 |
| An inch | 5.6 |
| 1 mm | 0.9 |

Or if we now think in terms of the Earth’s circumference (which is approximately 40000 km): take a walk (or some suitable means of transport) along the equator, and every 2.4 m-ish you have to stop and place a level-indicator pole of some kind.
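The INOB numbers are easy to reproduce. The sketch below assumes a Burj Khalifa height of 828 m and uses the Airbus 380 length of 72.7 m; those two sizes are my assumptions about what was measured, and the person and the bottle are left out since their sizes are not stated.

```python
import math

EVEREST_M = 8848.0  # reference range A, in meters
N_BITS = 24


def inob(length_m, reference_m=EVEREST_M, n_bits=N_BITS):
    """Impressive number of bits: log2(2^N * x / A)."""
    return n_bits + math.log2(length_m / reference_m)


# Object sizes in meters (assumed values, see the text above).
objects = {
    "Mount Everest": 8848.0,
    "Burj Khalifa": 828.0,
    "Airbus 380 (length)": 72.7,
    "An inch": 0.0254,
    "1 mm": 0.001,
}

for name, size_m in objects.items():
    print(f"{name:20s} {inob(size_m):5.1f} bits")

# Spacing between level-indicator poles along a 40000 km equator.
pole_spacing_m = 40_000e3 / 2 ** N_BITS
print(f"pole spacing: {pole_spacing_m:.2f} m")
```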

Well, that’s in meters, what about good-old Volt?

Now, assume we define the potential drop from top to bottom of Mount Everest to be 1 Volt, which in turn means that the least-significant voltage level is about 60 nV.
Also, going back to our audio case, say that the bandwidth of interest is 22 kHz. We can now compare the levels with the Johnson-Nyquist noise (kT).

$v_{nn} = \sqrt{ k T \cdot 22000 } \sim 9.54 nV_{rms}$

and the level of the quantization noise becomes

$v_{nq} = \sqrt{ 59.6^2 / 12 } \sim 17.21 nV_{rms}$

If the voltage swing of our Mount-Everest-DAC is 1 Volt, we are then sniffing around levels of the fundamental noise limit (for the given bandwidths used).
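The numbers can be checked with a few lines. This is a sketch that evaluates the two expressions exactly as written above and reads the results as volts; the temperature of 300 K is my assumption, since it is not stated.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K
T_KELVIN = 300.0            # assumed temperature
BANDWIDTH_HZ = 22e3         # audio bandwidth of interest
V_FULL_SCALE = 1.0          # the "Mount Everest" range, in volts
N_BITS = 24

# Least significant bit of the 24-bit, 1-V converter.
lsb = V_FULL_SCALE / 2 ** N_BITS

# Noise term sqrt(kT * B), as in the expression for v_nn above.
v_nn = math.sqrt(K_BOLTZMANN * T_KELVIN * BANDWIDTH_HZ)

# Quantization noise of a uniform quantizer: LSB / sqrt(12).
v_nq = lsb / math.sqrt(12.0)

print(f"LSB  = {lsb * 1e9:.1f} nV")
print(f"v_nn = {v_nn * 1e9:.2f} nV rms")
print(f"v_nq = {v_nq * 1e9:.2f} nV rms")
```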

Protect yourself!

How come the most common virus on my computer is the antivirus program itself?

“You are not protected. Act now!”

Well, maybe Roland Emmerich is right…

Another bound on power consumption in DACs

I wanted to go back to the previous post where we investigated the trade-offs between speed, resolution and power consumption in a digital-to-analog converter (DAC). That gave us a bound in that triangle.

What if we use another starting point in our argument rather than noise?

Same type of DAC

Consider the DAC in the figure. Same type of DAC as last time. We have a current through a resistor forming the voltage. What is the minimum possible quantum through the resistor? Leading question … Assume the least significant bit is determined by one single electron during the sampling period.
The average current for a single electron is given by

$I = q_0 / \Delta T = q_0 \cdot f_s$

where $f_s$ is the sample frequency and $q_0 = 0.1602$ aC is the elementary charge of the electron. (Let us ignore quantum effects and those things. Let us be traditional and assume we know where the electron is, before we open the box with the cat, …)

Small value?

So, is this a small or a large value?
Well, assume a resistive load of $R_L$. The (average) voltage for that particular charge, corresponding to one least significant bit, would be

$\Delta V = R_L \cdot I = R_L \cdot q_0 \cdot f_s$

Assume 100 Ohms in the load, and a sample frequency of 600 MHz. We get

$\Delta V = 100 \cdot 0.1602 \cdot 10^{-18} \cdot 600 \cdot 10^6 \approx 10^{-8}$

which is 10 nV. This would correspond to something like:

• In a 16-bit converter, this would mean that the peak voltage is some 655 uV.
• In a 20-bit converter, sampled at 1.2 GHz, the peak voltage is some 21 mV.

The voltages can simply not be less than that for the given sample frequencies and resolutions. Otherwise we have to split the electron (buying Swedes a cake).
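A minimal sketch of the arithmetic, using the unrounded elementary charge. The function names are mine. Since the text rounds the LSB up to 10 nV (and 20 nV at 1.2 GHz), the unrounded peak voltages come out slightly lower than the bullet values.

```python
Q0 = 1.602176634e-19  # elementary charge, in coulombs


def single_electron_lsb(r_load, f_s):
    """Average LSB voltage when one electron per sample flows through r_load."""
    return r_load * Q0 * f_s


def peak_voltage(n_bits, r_load, f_s):
    """Full-scale voltage of an n_bits converter with a single-electron LSB."""
    return 2 ** n_bits * single_electron_lsb(r_load, f_s)


lsb = single_electron_lsb(100.0, 600e6)
peak_16 = peak_voltage(16, 100.0, 600e6)
peak_20 = peak_voltage(20, 100.0, 1.2e9)

print(f"LSB at 100 Ohm, 600 MHz : {lsb * 1e9:.2f} nV")
print(f"16-bit peak at 600 MHz  : {peak_16 * 1e6:.0f} uV")
print(f"20-bit peak at 1.2 GHz  : {peak_20 * 1e3:.1f} mV")
```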

I think it is actually rather interesting: the faster you sample, the fewer electrons will be there for each least significant bit (LSB).

Full-swing signal

With a full-scale sinusoid signal in place, we can find the average power as

$P = \frac{V_{ref}^2}{8 R_L} = 2^{2 N - 3} \cdot \frac{q_0^2}{T^2} R_L = R_L \cdot 2^{2 N - 3} \cdot q_0^2 \cdot f_s^2$

which is then the absolute minimum possible power that must be consumed to obtain a certain resolution.
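As a sketch, the bound can be evaluated in both forms to confirm that they agree. The example values (16 bits, 100 Ohms, 600 MHz) are borrowed from the earlier example, and the function names are mine.

```python
Q0 = 1.602176634e-19  # elementary charge, in coulombs


def min_power_closed_form(n_bits, r_load, f_s):
    """P = R_L * 2^(2N - 3) * q0^2 * fs^2."""
    return r_load * 2 ** (2 * n_bits - 3) * Q0 ** 2 * f_s ** 2


def min_power_from_vref(n_bits, r_load, f_s):
    """P = Vref^2 / (8 R_L), with Vref = 2^N single-electron LSBs."""
    v_ref = 2 ** n_bits * r_load * Q0 * f_s
    return v_ref ** 2 / (8.0 * r_load)


p_a = min_power_closed_form(16, 100.0, 600e6)
p_b = min_power_from_vref(16, 100.0, 600e6)
print(f"closed form: {p_a * 1e9:.3f} nW, from Vref: {p_b * 1e9:.3f} nW")
```

For these example values the bound lands at roughly half a nanowatt, far below what any practical converter consumes.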

Are we having fun yet?

Just for the fun of it, let us rewrite the formula a bit

$P = C_L R_L \cdot 2^{2 N - 3} \cdot \frac{q_0 }{kT} \cdot \frac{kT} {C_L} \cdot q_0 \cdot f_s^2 = 2^{2 N - 3} \cdot \frac{q_0 }{kT} \cdot \frac{kT} {C_L} \cdot q_0 \cdot f_s / \pi$

where we have assumed that we also have to guarantee the bandwidth, not just the sample frequency and voltage levels.
In the equation, we can identify the inverse of the 26-mV thermal voltage (q/kT), a noise power kT/C, and some constants. Possibly, it could be related to the previous post. Philosophically (?) one could also think about what the thermal noise looks like when we push single electrons back and forth.

How fast do we need to sample over a 100-Ohm load to get a 1-V drop with a single electron? (Once again, from a mathematical point of view, and possibly not the correct physical description of the scenario).

$V = R_L \cdot I = R_L \cdot q_0 \cdot f_s = 1$

$f_s = 1 / (R_L \cdot q_0) = 62 \cdot 10^{15}$

Ok, so 60 PHz is rather fast… If we would correlate with light, it would end up in the UV domain, with a wavelength some 100 times shorter than visible light. And now we are kind of entering the photoelectric effects, sort of …
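The same thought experiment in a few lines, as a sketch: solve for the sample frequency and convert it to a free-space wavelength. The conversion via the speed of light is my addition, and the variable names are mine.

```python
Q0 = 1.602176634e-19        # elementary charge, in coulombs
SPEED_OF_LIGHT = 299792458  # m/s

R_LOAD = 100.0   # load resistance, in ohms
V_TARGET = 1.0   # desired voltage drop, in volts

# From V = R_L * q0 * fs.
f_s = V_TARGET / (R_LOAD * Q0)

# Free-space wavelength of light at that frequency.
wavelength_m = SPEED_OF_LIGHT / f_s

print(f"fs         = {f_s / 1e15:.1f} PHz")
print(f"wavelength = {wavelength_m * 1e9:.1f} nm")
```

Visible light spans roughly 400 to 700 nm, so a wavelength of a few nanometers is indeed about two orders of magnitude shorter.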

Why doesn’t the gain change in my CMOS common-source amplifier?

Background

The other day I gave the students in my course, “TSTE08 Analog and discrete-time integrated circuits”, a quiz. Perhaps it was a bit of a cryptic one, but I wanted to point out some of the difficulties with the current-voltage relationship in an analog amplifier, and the complexities in the choice of electrical vs. physical design parameters.

So, with this post I hope to give you both an insight into that quiz, but also an insight into a clever (?) way to set the DC operating points in your circuit…

The quiz

The quiz related to the common-source amplifier. The input is typically connected to the NMOS and a PMOS forms the active load. Moreover:

Assume I have a common-source amplifier with an active load. Further assume that the output and input DC voltages are fixed.
What should I do to increase the DC gain of my amplifier? Use hand-calculation formulas.

1) Increase the current through the amplifier
X) Increase the width of the transistors
2) Increase the length of the transistors

Any combination of those three could be correct.

So, perhaps the quiz was a bit cryptic, yes.

Solution

First suggestion on how to attack the problem: find the desired relations you need. For example, the gain, $A_0$ is something like:

$A_0 = \frac{g_{mn}}{g_{out}} = \frac{g_{mn}}{g_{p}+g_n} = \frac{\frac{2 I_D }{V_{EFF}}}{\lambda_p I_D + \lambda_n I_D} = \frac{1}{\frac{\lambda_p + \lambda_n}{2} \cdot V_{EFF}} = \frac{1}{\lambda \cdot V_{EFF}}$

where we can clearly see that it is “independent” (!) of the current through the amplifier. The

$V_{EFF} = V_{GS} - V_{Tn} = V_{i n} - V_{Tn}$

is the effective input (DC) overdrive voltage (which we cannot touch as per the quiz!). We also know from “hand calculations” that

$I_D = \alpha \cdot V_{EFF}^2$

and since $V_{EFF}$ cannot change we also see that $I_D / \alpha$ obviously has to be constant.
Technically, we could increase both current and size such that the ratio is kept constant. However, it still does not help if we look at the gain expression above. The gain is independent of the ratio, if the effective voltage remains constant.
Options 1 and X are not valid options.

Also here we need to look at another common relation for a MOS transistor: we know that with longer channel lengths, the channel length modulation reduces, so we get:

$\lambda \propto 1 / L$

such that

$A_0 \propto \frac{L}{ V_{EFF}}$

the channel length pops up in the numerator.

This means answer 2 is correct.
By increasing the channel length of the transistors, we effectively increase the output impedance and also increase the gain without touching the DC level (hand calculations).
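The argument can be sketched numerically with the same hand-calculation formulas. The constants `K_LAMBDA_N` and `K_LAMBDA_P` (the product $\lambda \cdot L$ for each device) are hypothetical values chosen for illustration, and both transistors are assumed to share the same length.

```python
import math

# Assumed products lambda * L, in um/V (hypothetical values).
K_LAMBDA_N = 0.05
K_LAMBDA_P = 0.05


def common_source_gain(i_d, v_eff, length_um):
    """A0 = gm / (g_p + g_n) with gm = 2*ID/Veff and g = lambda*ID."""
    lam_n = K_LAMBDA_N / length_um   # lambda is proportional to 1/L
    lam_p = K_LAMBDA_P / length_um
    g_m = 2.0 * i_d / v_eff
    g_out = (lam_n + lam_p) * i_d
    return g_m / g_out


V_EFF = 0.2  # fixed by the input DC level, in volts

# Options 1 and X: scale the current (and the width with it, to keep Veff).
for i_d in (10e-6, 100e-6, 1e-3):
    a0 = common_source_gain(i_d, V_EFF, 1.0)
    print(f"ID = {i_d * 1e6:6.0f} uA, L = 1.0 um: A0 = {20 * math.log10(a0):.1f} dB")

# Option 2: scale the channel length.
for length_um in (0.5, 1.0, 2.0, 5.0):
    a0 = common_source_gain(100e-6, V_EFF, length_um)
    print(f"ID =    100 uA, L = {length_um:.1f} um: A0 = {20 * math.log10(a0):.1f} dB")
```

In this idealized picture the current cancels completely and the gain scales linearly with the channel length.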

Is that really correct?

Well, is it really correct? Maybe not super-duper correct if we take all second- and third-order effects into account. There are probably small variations to the gain if we change current and width.

Let us hook up a testbench. Below we find a common-source stage with a somewhat cryptic circuit in the box on the top. That is actually a tuned current source that guarantees that the operating points, input and output DC voltages, are kept constant. The loop with the vcvs (amplifier) will increase the current through the circuit and make the output voltage follow the reference voltage in the top left. If we also make sure that the transistors are sized well to operate in saturation region for the sweeps we will do shortly, we are more or less fine to prove our point.

Then we run a DC input voltage sweep and an AC input sweep. Below we find the frequency response on the left and the DC response on the right. The left-hand figure tells us that the DC gain is somewhere around 25 dB-ish. From the right-hand figure, we see that the output DC voltage stays stuck at some 0.6 V.

Varying widths (effectively changing current)

We do a parametric sweep and capture the DC gain value (from the frequency sweep at 10 Hz) and plot it as a function of the transistor width. (Sorry for the thin line here, you might have to click the picture for a clearer view.) We see that the gain increases from some 23.5 to some 24.7 dB when increasing the width 100 (!) times. In a linear scale that is a very small change (about 15 %) for such a large variation and practically there is no (significant) change in gain.

Varying lengths

Then we do a parametric sweep and capture the DC gain value, still at 10 Hz from the frequency sweep, and plot it as a function of the transistor length. (Sorry for the thin line here, you might have to click the picture for a clearer view.) The gain now increases from some 17.5 to some 28 dB when increasing the channel length 10 (!) times. In a linear scale this is now a very significant change, more than three times (about 10.5 dB).
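To compare the two sweeps on a linear scale, the dB differences can be converted to voltage-gain ratios. This is a small sketch using only the endpoint values quoted in the text, which are themselves approximate readings from the plots.

```python
def db_to_ratio(delta_db):
    """Convert a voltage-gain difference in dB to a linear ratio."""
    return 10.0 ** (delta_db / 20.0)


# Endpoints read from the two parametric sweeps.
width_ratio = db_to_ratio(24.7 - 23.5)   # width increased 100 times
length_ratio = db_to_ratio(28.0 - 17.5)  # length increased 10 times

print(f"width sweep : gain changes by a factor {width_ratio:.2f}")
print(f"length sweep: gain changes by a factor {length_ratio:.2f}")
```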

A comment on the test bench

The trick in the test bench is quite useful actually. It is much more convenient than running the circuit in open loop to try to find the absolute settings. If the gain is very high, you have to do quite a few sweeps around the operating point in order for the simulator to have you find the best point.

The trick is also the convenient sp2tswitch that Cadence/spectre provides to simulate a circuit in different conditions. The AC simulation (which is run after DC) will inherit the DC settings and run the AC analysis around those operating points. Notice that I had to connect the two inputs of the vcvs (essentially a differential operational amplifier with a gain of 1000) to the reference voltage during AC to force the gate of the PMOS to be quiet. It is not clear why this was needed.

Can we trust the models?

I am preparing this year’s version of the analog integrated circuit courses (TSTE08 and TSEI12). For this purpose, I need to tweak some of the model cards for the simulator. We do not need the most fancy processes to demonstrate analog circuit design in the courses.

However, while doing this I revisited one old post:

as a way (thanks, Aamir) to plot e.g. the transconductance and output conductance of e.g. a common-source circuit. That is quite powerful in case you want to demonstrate the importance of choosing your operating region or operating point in general. With the above mentioned post, we can plot the parameters, such as the operating region!, as function of e.g. input voltage, etc.

I got a bit confused by the results at first, until I realized that I was invoking a level-1 model of the MOS in my testbench. Think Shichman-Hodges, hand calculations, if-statements, etc. There is so much material on the properties of the different models and I do not intend to touch upon them here; instead I serve a small comparison…

Consider the testbench below. It is a common-source stage with an NMOS driving transistor with an active, current-mirror load, where we set the current with an ideal current source, i.e., forcing current through the drive transistor. We then want to sweep the input transistor DC voltage to find an operating point of interest (or at least get an idea of the operation).

I will now switch in different models in my model card, and I also append one of those magic extras in an additional file to be able to plot what I want:

 save Mdrive:region
 save Mdrive:gds
 save Mdrive:gm

Level 1

Let us look at the results for different models. First, we start with level 1. One of the most basic models and used for old technologies. We plot the gain (gm/gds), the output voltage (vOut), the transistor’s operating region, the transconductance (gm), and the conductance (gds) as function of the input DC voltage, vInDc.

We can for example see how the transistor sweeps through different operating regions (green, brickwall): cut-off (0) – subthreshold (3) – saturation (2) – linear (1). The transconductance (gm, yellow) follows a more or less linear curve as soon as we enter the saturation region. (Notice that gm = alpha x Veff according to the good-old hand-calculations.)

The oddity in this simulation is the gain (blue). Notice that it is plotted in logarithmic scale (!) and for low input DC voltage the gain is huge ~ 30000. We even see that we do have some divide-by-zero happening for low values. Clearly this indicates something unrealistic with the model.
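One way to see why a square-law model with if-statements can produce such numbers is sketched below. This is only a hand-calculation analogy under my own assumptions (the values of `ALPHA` and `LAMBDA`, and a hard cut-off below threshold), not the simulator's actual level-1 implementation: with gm = 2·alpha·Veff and gds = lambda·ID, the ratio becomes 2/(lambda·Veff), which grows without bound as Veff approaches zero and is undefined in cut-off.

```python
import math

ALPHA = 100e-6  # assumed prefactor mu0*Cox/2 * W/L, in A/V^2
LAMBDA = 0.1    # assumed channel length modulation, in 1/V


def square_law_gain(v_eff):
    """gm/gds from the hand-calculation formulas in saturation."""
    if v_eff <= 0.0:
        # Cut-off: gm and gds are both zero, so the ratio is 0/0.
        return math.nan
    i_d = ALPHA * v_eff ** 2
    g_m = 2.0 * ALPHA * v_eff
    g_ds = LAMBDA * i_d
    return g_m / g_ds


for v_eff in (0.5, 0.1, 0.01, 0.001, 0.0):
    print(f"Veff = {v_eff:6.3f} V: gm/gds = {square_law_gain(v_eff):.0f}")
```

With these assumed values, a gain in the tens of thousands already appears for an effective voltage below a millivolt.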

Level 2

Let us see what level 2 can offer. This is the so-called Grove-Frohman model (Google! Interesting guys.) Below we find a similar picture. In this case, the gain looks much more moderate. The threshold voltage is different for this level, which implies a shift of the region towards higher voltages. We see a softer behavior in the transconductance, but still – at the shift from subthreshold to saturation region (around 0.65 V vInDc) – we see a tendency of a discontinuity. (Notice that we have a finer resolution in vInDc than illustrated by the tick-marks in the graphs).

Level 3

Level 3 – more based on empirical results. Similarly here, we see a discontinuity around the shift from cut-off/subthreshold to the saturation region. Even on the gain curve, we see a clear peaking indicating something strange. It would be more realistic to think of the transition as something continuous. Remember that the blue gain curve is still in a logarithmic scale. The peak hits some 40000 times of gain. The transconductance (yellow) is not as linear as in level 1.

Level 49

Let’s switch to level 49. This is a more modern model and is also called the BSIM3v3 (which comes in different flavours…) Once again the threshold voltage is different. The nice thing here now is that we see a smooth transition between the operating regions – especially from the subthreshold to saturation range. One would think that it is a more accurate model of the real-life transistor, but of course we cannot be entirely sure.

The gain seems more realistic, no sharp spikes or jumps, and settling towards a final value in a smooth fashion, both for low and high input voltages.

Notice that the threshold voltages, mobilities, etc., are not identical for the different levels. As such, the models are not directly comparable, and in this case we cannot make a true judgment of which model is the most correct. The same holds for my graphs, which have different scales and some with zoom adjusted, etc. Just look for the tendencies.

Different transistors models also try to model different physical phenomena and I will leave it to you to do the research in all books out there.

So, in short – check what you are simulating. Did you switch in the correct netlist, the correct parameters? Are there discontinuities in the curves? Probably, there shouldn’t be any. Etc.