-
SLAP: A Split Latency Adaptive VLIW pipeline architecture which enables on-the-fly variable SIMD vector-length
Authors:
Ashish Shrivastava,
Alan Gatherer,
Tong Sun,
Sushma Wokhlu,
Alex Chandra
Abstract:
Over the last decade the relative latency of access to shared memory by multicore increased as wire resistance dominated latency and low wire density layout pushed multiport memories farther away from their ports. Various techniques were deployed to improve average memory access latencies, such as speculative pre-fetching and branch-prediction, often leading to high variance in execution time which is unacceptable in real-time systems. Smart DMAs can be used to directly copy data into a layer1 SRAM, but with overhead. The VLIW architecture, the de facto signal processing engine, suffers badly from a breakdown in lockstep execution of scalar and vector instructions. We describe the Split Latency Adaptive Pipeline (SLAP) VLIW architecture, a cache performance improvement technology that requires zero change to object code, while removing smart DMAs and their overhead. SLAP builds on the Decoupled Access and Execute concept by 1) breaking lockstep execution of functional units, 2) enabling variable vector length for variable data level parallelism, and 3) adding a novel triangular load mechanism. We discuss the SLAP architecture and demonstrate the performance benefits on real traces from a wireless baseband system (where even the most compute intensive functions suffer from an Amdahl's law limitation due to a mixture of scalar and vector processing).
Submitted 25 February, 2021;
originally announced February 2021.
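The Amdahl's law limitation mentioned in the SLAP abstract above can be illustrated with a minimal sketch. The vectorizable fraction (0.8) and the vector widths used here are hypothetical values chosen for illustration; they are not taken from the paper.

```python
def amdahl_speedup(vector_fraction, vector_speedup):
    """Overall speedup when only `vector_fraction` of the runtime is accelerated.

    Both arguments are illustrative assumptions, not figures from the paper.
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must lie in [0, 1]")
    if vector_speedup <= 0:
        raise ValueError("vector_speedup must be positive")
    scalar_fraction = 1.0 - vector_fraction
    # Amdahl's law: the scalar part runs at the original speed,
    # the vector part runs vector_speedup times faster.
    return 1.0 / (scalar_fraction + vector_fraction / vector_speedup)


if __name__ == "__main__":
    for width in (4, 16, 64):
        print(width, round(amdahl_speedup(0.8, width), 2))
```

With 20% of the runtime left scalar, the overall speedup stays below 1 / 0.2 = 5 no matter how wide the vector unit is, which is the kind of ceiling a mixed scalar and vector workload runs into.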
-
Towards a Domain Specific Solution for a New Generation of Wireless Modems
Authors:
Alan Gatherer,
Ashish Shrivastava,
Hao Luan,
Asheesh Kashyap,
Zhenguo Gu,
Miguel Dajer
Abstract:
Wireless cellular Systems on Chip (SoCs) are experiencing unprecedented demands on data rate, latency, and use case variety. 5G wireless technologies require a massive number of antennas and complex signal processing to improve bandwidth and spectral efficiency. The Internet of Things is causing a proliferation in the number of connected devices, and service categories, such as ultra-reliable low latency, which will produce new use cases, such as self-driving cars, robotic factories, and remote surgery. In addressing these challenges, we can no longer rely on faster cores, or even more silicon. Modem software development is becoming increasingly error prone and difficult as the complexity of the applications and the architectures increases.
In this report we propose a Wireless Domain Specific Solution that takes a Dataflow acceleration approach and addresses the need of the SoC to support dataflows that change with use case and user activity, while maintaining the Firm Real Time High Availability with low probability of Heisenbugs that is required in cellular modems. We do this by developing a Domain Specific Architecture that describes the requirements in a suitably abstracted dataflow Domain Specific language. A toolchain is described that automates translation of those requirements in an efficient and robust manner and provides formal guarantees against Heisenbugs. The dataflow native DSA supports the toolchain output with specialized processing, data management and control features with high performance and low power, and recovers rapidly from dropped dataflows while continuing to achieve the real time requirements.
This report focuses on the dataflow acceleration in the DSA and the part of the automated toolchain that formally checks the performance and correctness of software running on this dataflow hardware. Results are presented and a summary of future work is given.
Submitted 4 December, 2020;
originally announced December 2020.
-
Combinatorics and Geometry for the Many-ported, Distributed and Shared Memory Architecture
Authors:
Hao Luan,
Alan Gatherer
Abstract:
Manycore SoC architectures based on on-chip shared memory are preferred for flexible and programmable solutions in many application domains. However, the development of many-ported memory is becoming increasingly challenging as we approach the end of Moore's Law while system requirements demand larger shared memory and more access ports. Memory can no longer be designed simply to minimize single transaction access time, but must take into account the functionality on the SoC. In this paper we examine a common large memory usage in SoC, where the memory is used as storage for large buffers that are then moved for time scheduled processing. We merge two aspects of many-ported memory design, combinatorial analysis of interconnect, and geometric analysis of critical paths, extending both to show that in this case the SoC performance benefits significantly from a hierarchical, distributed and staged architecture with lower-radix switches and fractal randomization of memory bank addressing, along with judicious and geometry-aware application of speed up. The results presented show the new architecture supports 20% higher throughput with 20% lower latency and 30% less interconnection area at approximately the same power consumption. We demonstrate the flexibility and scalability of this architecture on silicon from a physical design perspective by taking the design through layout. The architecture enables a much easier implementation flow that works well with physically irregular port access and memory dominant layout, which is a common issue in real designs.
Submitted 16 October, 2020;
originally announced October 2020.
-
Advanced Receiver Architectures for Millimeter Wave Communications with Low-Resolution ADCs
Authors:
Jinseok Choi,
Gilwon Lee,
Ahmed Alkhateeb,
Alan Gatherer,
Naofal Al-Dhahir,
Brian L. Evans
Abstract:
Employing low-resolution analog-to-digital converters (ADCs) for millimeter wave receivers with large antenna arrays provides an opportunity to efficiently reduce power consumption of the receiver. Reducing ADC resolution, however, results in performance degradation due to non-negligible quantization error. In addition, the large number of radio frequency (RF) chains is still not desirable. Accordingly, conventional low-resolution ADC systems require more efficient designs to minimize the cost and complexity while maximizing performance. In this article, we discuss advanced low-resolution ADC receiver architectures that further improve the spectral and energy efficiency tradeoff. To reduce both the numbers of RF chains and ADC bits, hybrid analog-and-digital beamforming is jointly considered with low-resolution ADCs. We explore the challenges in designing such receivers and present key insights on how the advanced architectures overcome such challenges. As an alternative low-resolution ADC receiver, we also introduce receivers with learning-based detection. The receiver does not require explicit channel estimation and is thereby suitable for one-bit ADC systems. Finally, future challenges and research issues are discussed.
Submitted 10 May, 2020; v1 submitted 7 March, 2020;
originally announced March 2020.
-
Base Station Antenna Selection for Low-Resolution ADC Systems
Authors:
Jinseok Choi,
Junmo Sung,
Narayan Prasad,
Xiao-Feng Qi,
Brian L. Evans,
Alan Gatherer
Abstract:
This paper investigates antenna selection at a base station with large antenna arrays and low-resolution analog-to-digital converters. For downlink transmit antenna selection for narrowband channels, we show (1) a selection criterion that maximizes sum rate with zero-forcing precoding equivalent to that of a perfect quantization system; (2) maximum sum rate increases with the number of selected antennas; (3) derivation of the sum rate loss function from using a subset of antennas; and (4) unlike high-resolution converter systems, sum rate loss reaches a maximum at a point of total transmit power and decreases beyond that point to converge to zero. For wideband orthogonal-frequency-division-multiplexing (OFDM) systems, our results hold when all subcarriers share a common subset of antennas. For uplink receive antenna selection for narrowband channels, we (1) generalize a greedy antenna selection criterion to capture tradeoffs between channel gain and quantization error; (2) propose a quantization-aware fast antenna selection algorithm using the criterion; and (3) derive a lower bound on sum rate achieved by the proposed algorithm based on submodular functions. For wideband OFDM systems, we extend our algorithm and derive a lower bound on its sum rate. Simulation results validate theoretical analyses and show increases in sum rate over conventional algorithms.
Submitted 30 June, 2019;
originally announced July 2019.
-
Robust Learning-Based ML Detection for Massive MIMO Systems with One-Bit Quantized Signals
Authors:
Jinseok Choi,
Yunseong Cho,
Brian L. Evans,
Alan Gatherer
Abstract:
In this paper, we investigate learning-based maximum likelihood (ML) detection for uplink massive multiple-input and multiple-output (MIMO) systems with one-bit analog-to-digital converters (ADCs). To overcome the significant dependency of learning-based detection on the training length, we propose two one-bit ML detection methods: a biased-learning method and a dithering-and-learning method. The biased-learning method keeps likelihood functions with zero probability from wiping out the obtained information through learning, thereby providing more robust detection performance. Extending the biased method to a system with knowledge of the received signal-to-noise ratio, the dithering-and-learning method estimates more likelihood functions by adding dithering noise to the quantizer input. The proposed methods are further improved by adopting the post likelihood function update, which exploits correctly decoded data symbols as training pilot symbols. The proposed methods avoid the need for channel estimation. Simulation results validate the detection performance of the proposed methods in symbol error rate.
Submitted 26 August, 2019; v1 submitted 30 November, 2018;
originally announced November 2018.
-
Optimizing Beams and Bits: A Novel Approach for Massive MIMO Base-Station Design
Authors:
Narayan Prasad,
Xiao-Feng Qi,
Alan Gatherer
Abstract:
We consider the problem of jointly optimizing ADC bit resolution and analog beamforming over a frequency-selective massive MIMO uplink. We build upon a popular model to incorporate the impact of low bit resolution ADCs, that hitherto has mostly been employed over flat-fading systems. We adopt weighted sum rate (WSR) as our objective and show that WSR maximization under finite buffer limits and important practical constraints on choices of beams and ADC bit resolutions can equivalently be posed as constrained submodular set function maximization. This enables us to design a constant-factor approximation algorithm. Upon incorporating further enhancements we obtain an efficient algorithm that significantly outperforms state-of-the-art ones.
Submitted 26 February, 2019; v1 submitted 17 October, 2018;
originally announced October 2018.
-
Antenna Selection for Large-Scale MIMO Systems with Low-Resolution ADCs
Authors:
Jinseok Choi,
Junmo Sung,
Brian L. Evans,
Alan Gatherer
Abstract:
One way to reduce the power consumption in large-scale multiple-input multiple-output (MIMO) systems is to employ low-resolution analog-to-digital converters (ADCs). In this paper, we investigate antenna selection for large-scale MIMO receivers with low-resolution ADCs, thereby providing more flexibility in resolution and number of ADCs. To incorporate quantization effects, we generalize an existing objective function for a greedy capacity-maximization antenna selection approach. The derived objective function offers an opportunity to select an antenna with the best tradeoff between the additional channel gain and increase in quantization error. Using the generalized objective function, we propose an antenna selection algorithm based on a conventional antenna selection algorithm without an increase in overall complexity. Simulation results show that the proposed algorithm outperforms the conventional algorithm in achievable capacity for the same number of antennas.
Submitted 20 April, 2019; v1 submitted 29 January, 2018;
originally announced January 2018.
-
ADC Bit Optimization for Spectrum- and Energy-Efficient Millimeter Wave Communications
Authors:
Jinseok Choi,
Junmo Sung,
Brian L. Evans,
Alan Gatherer
Abstract:
A spectrum- and energy-efficient system is essential for millimeter wave communication systems that require large antenna arrays with power-demanding ADCs. We propose an ADC bit allocation (BA) algorithm that solves a minimum mean squared quantization error problem under a power constraint. Unlike existing BA methods that only consider an ADC power constraint, the proposed algorithm considers a total receiver power constraint for a hybrid analog-digital beamforming architecture. The major challenge is the non-linearities in the minimization problem. To address this issue, we first convert the problem into a convex optimization problem through real number relaxation and substitution of ADC resolution switching power with constant average switching power. Then, we derive a closed-form solution by fixing the number of activated radio frequency (RF) chains M. Leveraging the solution, the binary search finds the optimal M and its corresponding optimal solution. We also provide an off-line training and modeling approach to estimate the average switching power. Simulation results validate the spectral and energy efficiency of the proposed algorithm. In particular, existing state-of-the-art digital beamformers can be used in the system in conjunction with the BA algorithm as it makes the quantization error negligible in the low-resolution regime.
Submitted 5 December, 2017;
originally announced December 2017.
-
Complex Block Floating-Point Format with Box Encoding For Wordlength Reduction in Communication Systems
Authors:
Yeong Foong Choo,
Brian L. Evans,
Alan Gatherer
Abstract:
We propose a new complex block floating-point format to reduce implementation complexity. The new format achieves wordlength reduction by sharing an exponent across the block of samples, and uses box encoding for the shared exponent to reduce quantization error. Arithmetic operations are performed on blocks of samples at a time, which can also reduce implementation complexity. For a case study of a baseband quadrature amplitude modulation (QAM) transmitter and receiver, we quantify the tradeoffs in signal quality vs. implementation complexity using the new approach to represent IQ samples. Signal quality is measured using error vector magnitude (EVM) in the receiver, and implementation complexity is measured in terms of arithmetic complexity as well as memory allocation and memory input/output rates. The primary contributions of this paper are (1) a complex block floating-point format with box encoding of the shared exponent to reduce quantization error, (2) arithmetic operations using the new complex block floating-point format, and (3) a QAM transceiver case study to quantify signal quality vs. implementation complexity tradeoffs using the new format and arithmetic operations.
Submitted 25 October, 2017; v1 submitted 2 May, 2017;
originally announced May 2017.
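The shared-exponent idea in the block floating-point abstract above can be sketched as follows. This is a plain block floating-point quantizer written for illustration only: it does not implement the paper's box encoding of the exponent, and the function names `bfp_encode` and `bfp_decode` and the mantissa width are hypothetical.

```python
import math


def bfp_encode(block, mantissa_bits):
    """Quantize complex samples to signed integer mantissas with one shared exponent.

    Illustrative sketch only; the box encoding described in the abstract is not modeled.
    """
    # The largest real or imaginary magnitude in the block sets the exponent.
    peak = max(max(abs(z.real), abs(z.imag)) for z in block)
    if peak == 0.0:
        return 0, [(0, 0)] * len(block)
    # frexp gives peak = m * 2**exponent with 0.5 <= m < 1,
    # so every component divided by 2**exponent has magnitude below 1.
    _, exponent = math.frexp(peak)
    scale = 2 ** (mantissa_bits - 1)
    lo, hi = -scale, scale - 1

    def quantize(x):
        # Round to the nearest mantissa step and clip to the signed range.
        return max(lo, min(hi, round(x / 2.0 ** exponent * scale)))

    return exponent, [(quantize(z.real), quantize(z.imag)) for z in block]


def bfp_decode(exponent, mantissas, mantissa_bits):
    """Rebuild complex samples from the shared exponent and integer mantissas."""
    step = 2.0 ** exponent / 2 ** (mantissa_bits - 1)
    return [complex(re * step, im * step) for re, im in mantissas]


if __name__ == "__main__":
    samples = [0.75 + 0.5j, -0.125 + 0.25j]
    exp, mants = bfp_encode(samples, mantissa_bits=8)
    print(exp, mants, bfp_decode(exp, mants, mantissa_bits=8))
```

Storing one exponent per block instead of one per sample is what shortens the wordlength; the cost is that small samples in a block with a large peak are quantized more coarsely.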
-
Resolution-Adaptive Hybrid MIMO Architectures for Millimeter Wave Communications
Authors:
Jinseok Choi,
Brian L. Evans,
Alan Gatherer
Abstract:
In this paper, we propose a hybrid analog-digital beamforming architecture with resolution-adaptive ADCs for millimeter wave (mmWave) receivers with large antenna arrays. We adopt array response vectors for the analog combiners and derive ADC bit-allocation (BA) solutions in closed form. The BA solutions reveal that the optimal number of ADC bits is logarithmically proportional to the RF chain's signal-to-noise ratio raised to the 1/3 power. Using the solutions, two proposed BA algorithms minimize the mean square quantization error of received analog signals under a total ADC power constraint. Contributions of this paper include 1) ADC bit-allocation algorithms to improve communication performance of a hybrid MIMO receiver, 2) approximation of the capacity with the BA algorithm as a function of channels, and 3) a worst-case analysis of the ergodic rate of the proposed MIMO receiver that quantifies system tradeoffs and serves as the lower bound. Simulation results demonstrate that the BA algorithms outperform a fixed-ADC approach in both spectral and energy efficiency, and validate the capacity and ergodic rate formula. For a power constraint equivalent to that of fixed 4-bit ADCs, the revised BA algorithm makes the quantization error negligible while achieving 22% better energy efficiency. Having negligible quantization error allows existing state-of-the-art digital beamformers to be readily applied to the proposed system.
Submitted 15 August, 2017; v1 submitted 11 April, 2017;
originally announced April 2017.
-
ADC Bit Allocation under a Power Constraint for MmWave Massive MIMO Communication Receivers
Authors:
Jinseok Choi,
Brian L. Evans,
Alan Gatherer
Abstract:
Millimeter wave (mmWave) systems operating over a wide bandwidth and using a large number of antennas impose a heavy burden on power consumption. In a massive multiple-input multiple-output (MIMO) uplink, analog-to-digital converters (ADCs) would be the primary consumer of power in the base station receiver. This paper proposes a bit allocation (BA) method for mmWave multi-user (MU) massive MIMO systems under a power constraint. We apply ADCs to the outputs of an analog phased array for beamspace projection to exploit mmWave channel sparsity. We relax a mean square quantization error (MSQE) minimization problem and map the closed-form solution to non-negative integer bits at each ADC. In link-level simulations, the proposed method gives better communication performance than conventional low-resolution ADCs for the same or less power. Our contribution is a near optimal low-complexity BA method that minimizes total MSQE under a power constraint.
Submitted 24 February, 2017; v1 submitted 16 September, 2016;
originally announced September 2016.
-
Power Control in Two-Tier Femtocell Networks
Authors:
Vikram Chandrasekhar,
Jeffrey G. Andrews,
Tarik Muharemovic,
Zukang Shen,
Alan Gatherer
Abstract:
In a two-tier cellular network -- comprised of a central macrocell underlaid with shorter range femtocell hotspots -- cross-tier interference limits overall capacity with universal frequency reuse. To quantify near-far effects with universal frequency reuse, this paper derives a fundamental relation providing the largest feasible cellular Signal-to-Interference-Plus-Noise Ratio (SINR), given any set of feasible femtocell SINRs. We provide a link budget analysis which enables simple and accurate performance insights in a two-tier network. A distributed utility-based SINR adaptation at femtocells is proposed in order to alleviate cross-tier interference at the macrocell from cochannel femtocells. The Foschini-Miljanic (FM) algorithm is a special case of the adaptation. Each femtocell maximizes its individual utility consisting of an SINR-based reward less an incurred cost (interference to the macrocell). Numerical results show greater than 30% improvement in mean femtocell SINRs relative to FM. In the event that cross-tier interference prevents a cellular user from obtaining its SINR target, an algorithm is proposed that reduces transmission powers of the strongest femtocell interferers. The algorithm ensures that a cellular user achieves its SINR target even with 100 femtocells/cell-site, and requires a worst case SINR reduction of only 16% at femtocells. These results motivate design of power control schemes requiring minimal network overhead in two-tier networks with shared spectrum.
Submitted 13 May, 2009; v1 submitted 21 October, 2008;
originally announced October 2008.
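The Foschini-Miljanic (FM) algorithm named in the power control abstract above is the standard distributed update in which each link scales its power by the ratio of its target SINR to its measured SINR. The sketch below shows only that baseline iteration, not the paper's utility-based adaptation; the gain matrix, noise level, and targets are hypothetical values chosen so that the targets are feasible.

```python
def sinr(powers, gains, noise, i):
    """SINR of link i; gains[i][j] is the gain from transmitter j to receiver i."""
    interference = sum(
        gains[i][j] * powers[j] for j in range(len(powers)) if j != i
    )
    return gains[i][i] * powers[i] / (interference + noise)


def foschini_miljanic(gains, noise, targets, iterations=200):
    """Baseline FM iteration: p_i <- (target_i / SINR_i) * p_i for every link."""
    powers = [1.0] * len(targets)
    for _ in range(iterations):
        # All links update simultaneously from the previous power vector.
        powers = [
            targets[i] / sinr(powers, gains, noise, i) * powers[i]
            for i in range(len(powers))
        ]
    return powers


if __name__ == "__main__":
    # Hypothetical two-link example with weak cross-link gains.
    gains = [[1.0, 0.1], [0.1, 1.0]]
    powers = foschini_miljanic(gains, noise=0.1, targets=[2.0, 2.0])
    print(powers, [sinr(powers, gains, 0.1, i) for i in range(2)])
```

Each link needs only its own SINR measurement, which is why the update is attractive as a distributed scheme; when the targets are not jointly feasible the powers grow without bound rather than converge.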
-
Femtocell Networks: A Survey
Authors:
Vikram Chandrasekhar,
Jeffrey Andrews,
Alan Gatherer
Abstract:
The surest way to increase the system capacity of a wireless link is by getting the transmitter and receiver closer to each other, which creates the dual benefits of higher quality links and more spatial reuse. In a network with nomadic users, this inevitably involves deploying more infrastructure, typically in the form of microcells, hotspots, distributed antennas, or relays. A less expensive alternative is the recent concept of femtocells, also called home base-stations, which are data access points installed by home users to get better indoor voice and data coverage. In this article, we overview the technical and business arguments for femtocells, and describe the state-of-the-art on each front. We also describe the technical challenges facing femtocell networks, and give some preliminary ideas for how to overcome them.
Submitted 20 September, 2008; v1 submitted 6 March, 2008;
originally announced March 2008.