SAN Fundamentals - 1

FAQS

1. What is UAS? 

USB Attached SCSI (UAS) is a computer protocol for moving data to and from USB storage devices using SCSI commands. It replaces the older Bulk-Only Transport (BOT) of the USB Mass Storage class. Two standards govern the USB and SCSI combination: USB Attached SCSI (UAS) and the USB Attached SCSI Protocol (UASP).

2. What are the different fields found in the IQN? 

The fields of an iSCSI Qualified Name (IQN) are:

● The literal type string "iqn".
● A date field (yyyy-mm) giving the month in which the owner took ownership of the domain name.
● The reversed domain name of the owner.
● An optional storage target name, prefixed with ':', identifying the exact target to connect to.
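
As a rough illustration of these fields, the sketch below splits an IQN into its parts. The function name parse_iqn and the example name are hypothetical, and the pattern is deliberately simplified; it is not a full validator.

```python
import re

# Simplified pattern: "iqn." + yyyy-mm + "." + reversed domain + optional ":" + target name.
IQN_PATTERN = re.compile(r"^(iqn)\.(\d{4}-\d{2})\.([^:]+)(?::(.+))?$")

def parse_iqn(name):
    """Split an IQN into (type, date, reversed_domain, target); target may be None."""
    match = IQN_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognisable IQN: {name!r}")
    return match.groups()

# Hypothetical example name, for illustration only.
print(parse_iqn("iqn.2001-04.com.example:storage.disk1"))
```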

3. What is an initiator? What are the different types of initiators?

An initiator is the client side of an iSCSI connection: it sends SCSI commands over the network to an iSCSI target in order to read or write data. To the host, it behaves like a SCSI bus adapter connected to the disks.

There are two different types of initiators:

Software initiator: Operating-system or kernel-level code that presents a SCSI interface to higher-level applications. Kernel drivers wrap SCSI commands and data in TCP/IP packets on the way out and unwrap them on the way in.

Hardware initiator: Dedicated hardware performs the translation between SCSI and TCP/IP. Because the adapter has its own network connectivity and firmware for processing the traffic, this removes that processing overhead from the host OS.

4. What is 8b/10b encoding? What is its use? 

8b/10b encoding maps each 8-bit symbol to a 10-bit symbol so that the line code is DC-balanced and has bounded disparity. It also guarantees enough bit transitions for the receiver to recover the clock, which makes reliable transmission possible. The lower 5 bits of each byte are encoded as a 6-bit group (5b/6b encoding) and the upper 3 bits as a 4-bit group (3b/4b encoding), so each 8 bits of data become a 10-bit line code.
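
The cost of this encoding is that only 8 of every 10 transmitted bits carry data. The sketch below works through that arithmetic; the function name is hypothetical, and the 1.0625 Gbaud line rate is used only as an assumed example.

```python
def payload_rate_bits(line_rate_baud):
    """Payload bit rate on an 8b/10b link: 8 data bits per 10 line bits."""
    return line_rate_baud * 8 / 10

# Assumed example: a 1.0625 Gbaud line rate.
line_rate = 1_062_500_000
megabytes_per_second = payload_rate_bits(line_rate) / 8 / 1_000_000
print(megabytes_per_second, "MB/s before framing overhead")
```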

5. What are the advantages of using SATA over ATA?

SATA has several advantages over parallel ATA: it supports hot plugging and hot swapping, installation costs less, the cabling is simpler, and data rates start at 1.5 Gbit/s in the first SATA generation.

6. What is RAID? What are the advantages of using one? 

RAID stands for Redundant Array of Independent Disks. It is a mechanism that spreads data across multiple disks and, in most levels, keeps redundant information (a mirror copy or parity) on separate disks. This redundancy helps avoid data loss when a single disk in the array malfunctions. The management layer that configures the array and provides the interface for accessing it is called the RAID controller.

7. What are the FC3 level services? 

The FC-3 level gives the following services: 

● Striping: A single chunk of data can be transmitted in parallel across multiple N_Ports, maximising bandwidth utilization.

● Hunt groups: Multiple N_Ports can respond to requests sent to a single alias address, which keeps access uninterrupted at times of heavy load.

● Multicast: A single unit of information can be transmitted once and received by multiple N_Ports on an FC network.

8. What is flow control? What are the different flow control mechanisms used by the different types of frames? 

Flow control is the mechanism at the FC-2 level that paces frame transmission between N_Ports, and between an N_Port and the fabric, so that receive buffers do not overflow. Class 1 frames use end-to-end flow control, Class 2 frames use both buffer-to-buffer and end-to-end flow control, and Class 3 frames use buffer-to-buffer flow control only.
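
Buffer-to-buffer flow control is credit based: the sender spends one credit per frame and regains one each time the receiver signals R_RDY. The toy model below illustrates the idea only; the class name and the credit value are hypothetical, and real ports negotiate their credit at login.

```python
class BufferToBufferLink:
    """Toy model of buffer-to-buffer credit between two directly connected ports."""

    def __init__(self, bb_credit):
        self.credits = bb_credit  # frames the sender may still transmit

    def send_frame(self):
        """Consume one credit; return False if the sender has to wait."""
        if self.credits == 0:
            return False
        self.credits -= 1
        return True

    def receive_r_rdy(self):
        """The receiver freed a buffer and signalled R_RDY: one credit returns."""
        self.credits += 1

link = BufferToBufferLink(bb_credit=2)  # credit value chosen only for illustration
print([link.send_frame() for _ in range(3)])  # the third send is refused
link.receive_r_rdy()
print(link.send_frame())  # allowed again after R_RDY
```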

9. What is an ordered set and what are the uses of the different ordered sets? 

An ordered set is a four-byte transmission word used to coordinate data transfer across an FC network.

The three types of ordered sets and their uses are:

● Frame delimiters mark the start (SOF) and end (EOF) of a frame.

● Primitive signals: Idle indicates that a port is operational and ready to send and receive frames, and R_RDY indicates that a receive buffer is available, so that frames are not lost for lack of buffer space.

● Primitive sequences are transmitted repeatedly to report or change the state of a port or link.

10. What is DAS? What are the advantages of SAN over DAS? 

DAS stands for Direct Attached Storage: the physical disks are attached directly to the computer (server) that uses them. This is a feasible solution for a stand-alone server with modest data needs and a single point of operation. For an application or infrastructure that spans many servers and needs stable backup and redundant data management, a Storage Area Network is the better option.

The advantages are:

● Multiple points of access.
● Stability of the infrastructure.
● Avoidance of a single point of failure.
● Safeguarding against data loss.

11. What is the use of the Logical Unit Number?

The Logical Unit Number (LUN) is a number that identifies a specific device, or a specific volume within a device, as the target of a read/write operation in a SAN. Storage protocols such as SCSI, iSCSI and FC present storage as separate logical units, and each access names the LUN to say which one is meant. LUNs are used in practically all RAID installations and their management.

12. What is WWPN and WWNN? Can same WWNN be assigned to different ports? 

WWPN stands for World Wide Port Name, a globally unique 64-bit identifier assigned to a port on an FC network. It is similar to the MAC address used in Ethernet. WWNN stands for World Wide Node Name, the unique name assigned to a node (device) connected to an FC network. Yes, the same WWNN can be shared by multiple ports, which then represent multiple interfaces of that one node.

13. What is SCSI and iSCSI? 

SCSI stands for Small Computer System Interface, a standard for the buses and commands that connect a computer to its storage disks. iSCSI (Internet SCSI) carries the same SCSI commands over TCP/IP, so commands and data are exchanged with remote storage much as they would be over a SCSI bus.

14. How can you compare FC with iSCSI?


FC is a dedicated storage networking technology that offers high speed and reliability for data transfer in storage systems. iSCSI, on the other hand, is a TCP/IP-based protocol that lets systems exchange storage data over existing IP networks. The speed of iSCSI is therefore limited by the underlying network, while FC offers speeds of up to 10 Gbit/s.

15. What is FCoE? 

Fibre Channel over Ethernet (FCoE) combines the merits of Fibre Channel with Ethernet as the common transport by carrying Fibre Channel frames over Ethernet. This gives Storage Area Networks Fibre Channel style access on an Ethernet infrastructure. Server-to-server and client-server connectivity is enhanced, the infrastructure is more stable, and all systems operate with the FC protocol, which makes it easier to integrate the latest storage device standards.

16. What are the advantages of iSCSI over other options?

● Minimal modifications are required to existing networks.
● The servers can be ordinary low-cost ones.
● Servers no longer depend on the actual physical storage.
● More server resources are left for computation.
● All types of storage media (SATA, SCSI, etc.) can be integrated easily.
● Snapshots of data come in handy during server replacements, as opposed to restoring from backups.
● Simpler management and storage expansion.

17. What are the different topologies and the number of devices for each in case of Fibre Channel? 

The three topologies and the number of devices each can connect are:

● Point-to-point, which supports 2 devices.
● FC arbitrated loop, which supports up to 127 devices.
● Switched fabric, which supports up to 2^24 devices.
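
The 2^24 figure is the size of the 24-bit port address space used in a switched fabric; it is an upper bound, since some addresses are reserved. The sketch below assumes the common layout of three 8-bit fields (domain, area, port). The function name and the example address are hypothetical.

```python
def split_fc_address(fc_id):
    """Split a 24-bit Fibre Channel address into (domain, area, port) bytes."""
    if not 0 <= fc_id < 2**24:
        raise ValueError("address must fit in 24 bits")
    return (fc_id >> 16) & 0xFF, (fc_id >> 8) & 0xFF, fc_id & 0xFF

print(2**24)                       # size of the 24-bit address space
print(split_fc_address(0x010200))  # example address chosen for illustration
```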

18. What is the concept of Storage Virtualization?

Storage virtualization is the use of an intermediate interface to manage storage logically, giving the end user independence from the physical storage. It provides an abstraction for applications and other data accesses, making storage simple and generic for the end user of the data.

19. What are the benefits of Fibre Channel SANs?

Fibre Channel SANs are the de facto standard for storage networking in the corporate data center because they provide exceptional reliability, scalability, consolidation, and performance. Fibre Channel SANs provide significant advantages over direct-attached storage through improved storage utilization, higher data availability, reduced management costs, and highly scalable capacity and performance.

20. What environment is most suitable for Fibre Channel SANs?

Typically, Fibre Channel SANs are most suitable for large data centers running business-critical data, as well as applications that require high-bandwidth performance such as medical imaging, streaming media, and large databases. Fibre Channel SAN solutions can easily scale to meet the most demanding performance and availability requirements.

21. What customer problems do Fibre Channel SANs solve?

The increased performance of Fibre Channel enables a highly effective backup and recovery approach, including LAN-free and server-free backup models. The result is a faster, more scalable, and more reliable backup and recovery solution. By providing flexible connectivity options and resource sharing, Fibre Channel SANs also greatly reduce the number of physical devices and disparate systems that must be purchased and managed, which can dramatically lower capital expenditures. Heterogeneous SAN management provides a single point of control for all devices on the SAN, lowering costs and freeing personnel to do other tasks.

22. How long has Fibre Channel been around?

Development started in 1988, ANSI standard approval occurred in 1994, and large deployments began in 1998. Fibre Channel is a mature, safe, and widely deployed solution for high-speed (1, 2, and 4 Gbit/s) communications and is the foundation for the majority of SAN installations throughout the world.

23. What is the future of Fibre Channel SANs?

Fibre Channel is a well-established, widely deployed technology with a proven track record and a very large installed base, particularly in high-performance, business-critical data center environments. Fibre Channel SANs continue to grow and will be enhanced for a long time to come. The reduced costs of Fibre Channel components, the availability of SAN kits, and the next generation of Fibre Channel (4 Gbit/s) are helping to fuel that growth. In addition, the Fibre Channel roadmap includes plans to double performance every three years.

24. What are the benefits of 4 Gbit/s Fibre Channel?

Benefits include twice the performance with little or no price increase, investment protection with backward compatibility to 2 Gbit/s, higher reliability due to fewer SAN components (switch and HBA ports) required, and the ability to replicate, back up, and restore data more quickly. 4 Gbit/s Fibre Channel systems are ideally suited for applications that need to quickly transfer large amounts of data such as remote replication across a SAN, streaming video on demand, modeling and rendering, and large databases. 4 Gbit/s technology is shipping today.

25. How is Fibre Channel different from iSCSI?

Fibre Channel and iSCSI each have a distinct place in the IT infrastructure as SAN alternatives to DAS. Fibre Channel generally provides high performance and high availability for business-critical applications, usually in the corporate data center. In contrast, iSCSI is generally used to provide SANs for business applications in smaller regional or departmental data centers.

26. When should I deploy Fibre Channel instead of iSCSI?

For environments consisting of high-end servers that require high bandwidth or data center environments with business-critical data, Fibre Channel is a better fit than iSCSI. For environments consisting of many midrange or low-end servers, an IP SAN solution often delivers the most appropriate price/performance.

27. Name some of the SAN topologies.

Point-to-point, arbitrated loop, and switched fabric topologies.

28. Why is a separate network needed for storage? Why can't the LAN be used?

LAN hardware and operating systems are geared to user traffic, and LANs are tuned for fast response to user messaging requests.
With a SAN, the storage units can be secured separately from the servers and kept entirely apart from the user network. This improves block-level storage access (bulk data transfers) and is advantageous for server-less backups.

29. What are the advantages of RAID?

“Redundant Array of Inexpensive Disks”
Depending on how the array is configured, the data can be
- mirrored [RAID 1] (duplicate copies on separate drives),
- striped [RAID 0] (interleaved across several drives), or
- parity protected [RAID 5] (extra data written so that lost data can be reconstructed).
These can be used in combination to deliver the balance of performance and reliability that the user requires.

30. Define RAID. Which level do you feel is a good choice?

RAID (Redundant Array of Independent Disks) is a technology for achieving redundancy together with faster I/O. There are many RAID levels to meet different customer needs: R0, R1, R3, R4, R5, R6 and R10.

Customers generally choose R5 because it offers a cost-effective balance of redundancy and speed.

R0 – Striped set without parity/[Non-Redundant Array].

Provides improved performance and additional storage but no fault tolerance. A single disk failure destroys the entire array, because data written to a RAID 0 array is broken into fragments, with the number of fragments dictated by the number of disks in the array. The fragments are written to their respective disks simultaneously on the same sector, so smaller sections of the data can be read off the disks in parallel, giving this arrangement very high bandwidth. RAID 0 does not implement error checking, so any error is unrecoverable. More disks in the array mean higher bandwidth but a greater risk of data loss.

R1 - Mirrored set without parity.

Provides fault tolerance against disk errors and the failure of all but one of the drives. Read performance increases when a multi-threaded operating system that supports split seeks is used, and there is only a very small performance reduction when writing. The array continues to operate so long as at least one drive is functioning. Using RAID 1 with a separate controller for each disk is sometimes called duplexing.

R3 - Striped set with dedicated parity/Byte-interleaved parity.

This mechanism provides improved performance and fault tolerance similar to RAID 5, but with a dedicated parity disk rather than rotated parity stripes. The single parity disk is a bottleneck for writing, since every write requires updating the parity data. One minor benefit is that if the dedicated parity disk fails, operation continues without parity and without a performance penalty.

R4 - Block level parity.

Identical to RAID 3, but does block-level striping instead of byte-level striping. In this setup, files can be distributed between multiple disks. Each disk operates independently which allows I/O requests to be performed in parallel, though data transfer speeds can suffer due to the type of parity. The error detection is achieved through dedicated parity and is stored in a separate, single disk unit.

R5 - Striped set with distributed parity.

Distributed parity requires all drives but one to be present to operate; drive failure requires replacement, but the array is not destroyed by a single drive failure. Upon drive failure, any subsequent reads can be calculated from the distributed parity such that the drive failure is masked from the end user. The array will have data loss in the event of a second drive failure and is vulnerable until the data that was on the failed drive is rebuilt onto a replacement drive.
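
The reconstruction described above relies on XOR parity: the parity block is the XOR of the data blocks in a stripe, so any one missing block equals the XOR of all the others. A minimal sketch with toy values follows; the function name is hypothetical, and real controllers work on much larger blocks and rotate the parity position across disks.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks of one stripe (toy values) and their parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] failed: rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```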

R6 - Striped set with dual distributed Parity.

Provides fault tolerance from two drive failures; array continues to operate with up to two failed drives. This makes larger RAID groups more practical, especially for high availability systems. This becomes increasingly important because large-capacity drives lengthen the time needed to recover from the failure of a single drive. Single parity RAID levels are vulnerable to data loss until the failed drive is rebuilt: the larger the drive, the longer the rebuild will take. Dual parity gives time to rebuild the array without the data being at risk if one drive, but no more, fails before the rebuild is complete.
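
To compare the capacity cost of these levels, the sketch below returns how many disks' worth of space remain usable for n equal-sized disks. The function name, the minimum disk counts, and the assumption that R10 uses two-way mirrors are illustrative choices, not taken from any particular controller.

```python
def usable_disks(level, n):
    """Number of disks' worth of usable capacity for n equal-sized disks."""
    minimum = {"R0": 2, "R1": 2, "R5": 3, "R6": 4, "R10": 4}  # assumed minimums
    if level not in minimum:
        raise ValueError(f"unsupported level: {level}")
    if n < minimum[level]:
        raise ValueError(f"{level} needs at least {minimum[level]} disks")
    if level == "R0":
        return n        # striping only, no redundancy
    if level == "R1":
        return 1        # every disk holds the same data
    if level == "R5":
        return n - 1    # one disk's worth of parity
    if level == "R6":
        return n - 2    # two disks' worth of parity
    if n % 2:
        raise ValueError("R10 needs an even number of disks")
    return n // 2       # R10: two-way mirrors, then striped

for level in ("R0", "R1", "R5", "R6", "R10"):
    print(level, usable_disks(level, 4))
```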

31. What is the difference between RAID 0+1 and RAID 1+0?

RAID 0+1 (Mirrored Stripes)

In this RAID level data is written to striped volumes, which are in turn mirrored. A single disk failure causes no data loss, but it makes the whole striped set containing that disk unavailable: that side of the mirror is marked inactive and the data then lives on only one striped volume. The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set to mirror a primary striped set. The array continues to operate with one or more drives failed on the same side of the mirror, but if drives fail on both sides of the mirror the data on the RAID system is lost.

RAID 1+0 (Striped Mirrors)

In this RAID level data is written to mirrored volumes, which are in turn striped, so a single disk failure causes no data loss. The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives. In a failed-disk situation RAID 1+0 performs better because all the remaining disks continue to be used. The array can sustain multiple drive losses so long as no mirror loses both its drives. This RAID level is preferred for high performance and high data protection because rebuilding RAID 1+0 takes less time than rebuilding RAID 0+1.
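
The difference in fault tolerance can be checked by enumerating failures. The sketch below assumes four disks grouped as (0, 1) and (2, 3) and counts how many of the six possible two-disk failures each layout survives; the function names and the grouping are hypothetical.

```python
from itertools import combinations

def survives_raid10(failed, pairs):
    """RAID 1+0 survives while every mirrored pair keeps at least one disk."""
    return all(any(d not in failed for d in pair) for pair in pairs)

def survives_raid01(failed, stripe_sets):
    """RAID 0+1 survives while at least one striped set is fully intact."""
    return any(all(d not in failed for d in s) for s in stripe_sets)

groups = [(0, 1), (2, 3)]  # four disks; grouping chosen for illustration
two_disk_failures = [set(c) for c in combinations(range(4), 2)]
raid10_ok = sum(survives_raid10(f, groups) for f in two_disk_failures)
raid01_ok = sum(survives_raid01(f, groups) for f in two_disk_failures)
print(raid10_ok, "of", len(two_disk_failures), "two-disk failures survived by RAID 1+0")
print(raid01_ok, "of", len(two_disk_failures), "two-disk failures survived by RAID 0+1")
```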

32. When are JBODs used?

“Just a Bunch of Disks”
A JBOD is a collection of disks that share a common connection to the server but do not include the mirroring, striping, or parity facilities that RAID systems have. Those capabilities can still be provided by host-based software.

33. Differentiate between RAID and JBOD.

RAID: “Redundant Array of Inexpensive Disks”
Fault-tolerant grouping of disks that server sees as a single disk volume
Combination of parity-checking, mirroring, striping
Self-contained, manageable unit of storage

JBOD: “Just a Bunch of Disks”
Drives independently attached to the I/O channel
Scalable, but requires server to manage multiple volumes
Does not provide protection in case of drive failure

34. What is an HBA?

A host bus adapter (HBA) is the adapter that connects the server (host) to the storage.

35. What are the advantages of SAN?

Massively extended scalability
Greatly enhanced device connectivity
Storage consolidation
LAN-free backup
Server-less (active-fabric) backup
Server clustering
Heterogeneous data sharing
Disaster recovery - Remote mirroring
When answering, make clear what each advantage means, which ones are cost-effective, and which ones suit the client's requirements.

36. What is the difference between SAN and NAS?

The basic difference is that a SAN provides block-level access over a dedicated storage fabric, while NAS provides file-level access over the Ethernet LAN.
SAN - Storage Area Network
It accesses data at block level and presents space to the host in the form of a disk.
NAS - Network Attached Storage
It accesses data at file level and presents space to the host in the form of a shared network folder.

37. What does a typical storage area network consist of, if we consider implementing it in a small business setup?

For a small business, the essential components of a SAN are:
- Fabric Switch
- FC Controllers
- JBODs

38. Can you briefly explain each of these Storage area components?

Fabric Switch: A device that interconnects multiple network devices. Switches typically range from 16 to 32 ports, connecting 16 or 32 machine nodes. Vendors that manufacture these switches include Brocade and McData.

39. What is meant by FC Controllers & JBOD?

FC Controllers: These are adapter cards that sit in the PCI slots of a server and carry the data transfer; arrays and volumes can be configured on them.

JBOD: Just a Bunch of Disks is a storage box. It consists of an enclosure hosting a set of hard drives, which can be of several types such as SCSI, SAS, FC, or SATA.

40. What is the most critical component in SAN?

Each component has its own criticality with respect to business needs of a company.

41. How is a SAN managed?

Many management software packages are used for managing SANs. To name a few:
- SANtricity
- IBM Tivoli Storage Manager
- CA Unicenter
- Veritas Volume Manager

42. What is the default ID for a SCSI HBA?

Generally the default ID for a SCSI HBA is 7.
SCSI - Small Computer System Interface
HBA - Host Bus Adapter

43. What is the highest and lowest priority of SCSI?

There are 16 different IDs that can be assigned to SCSI devices. In order of decreasing priority they are 7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8.
The highest-priority SCSI ID is 7 and the lowest-priority ID is 8.
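
Because the priority order is not numeric, it is easiest to treat it as a lookup list. The sketch below picks the winner among contending IDs using the order given above; the function name is hypothetical.

```python
# Priority order from the answer above: 7 is highest, 8 is lowest.
PRIORITY_ORDER = [7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8]

def arbitration_winner(contending_ids):
    """Return the contending SCSI ID with the highest priority."""
    return min(contending_ids, key=PRIORITY_ORDER.index)

print(arbitration_winner([3, 12, 0]))  # 3 outranks 0, which outranks 12
```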

44. How do you install device drivers for the HBA during a first-time OS installation?

In some scenarios the operating system must be installed on drives connected through a SCSI HBA or SCSI RAID controller, but most OS installation media do not include drivers for those controllers, so the drivers have to be supplied externally. When installing Windows, press F6 during setup and provide the driver disk or CD that came with the HBA.
When installing Linux, type "linux dd" at the installer boot prompt to load a driver disk.

45. What is an array?

An array is a group of independent physical disks on which volumes or RAID volumes are configured.

46. Which are the SAN topologies?

SAN can be connected in 3 types which are mentioned below:
Point to Point topology
FC Arbitrated Loop (FC: Fibre Channel)
Switched Fabric

47. What are the 4 types of SAN architecture?

a. Core-edge
b. Full-Mesh
c. Partial-Mesh
d. Cascade

48. Which command is used in Linux to find the driver version of a hardware device?

dmesg (the kernel log usually records the driver name and version when the driver loads)

49. How many minimum drives are required to create R5 (RAID 5)?

You need to have at least 3 disk drives to create R5.

50. Can you name some of the states of RAID array?

A RAID array reports a state that represents its current status. Some of these states are:

a. Online
b. Degraded
c. Rebuilding
d. Failed

51. Name the features of SCSI-3 standard?

QAS: Quick arbitration and selection
Domain Validation
CRC: Cyclic redundancy check

52. Can we assign a hot spare to R0 (RAID 0) array?

No. Since R0 is not a redundant array, the failure of any disk results in the failure of the entire array, so there is no data from which a hot spare could be rebuilt.

53. Can you name some of the available tape media types?

Many types of tape media are available for backing up data. Some of them are:

DLT: digital linear tape - technology for tape backup/archive of networks and servers; DLT technology addresses midrange to high-end tape backup requirements.

LTO: linear tape open; a new standard tape format developed by HP, IBM, and Seagate.

AIT: advanced intelligent tape; a helical scan technology developed by Sony for tape backup/archive of networks and servers, specifically addressing midrange to high-end backup requirements.

54. What is HA?

High Availability (HA) is a technology for achieving failover with very low latency. It is a practical requirement of data centers today, when customers expect servers to run 24 hours a day, 7 days a week, 365 days a year, usually referred to as 24x7x365. To achieve this, a redundant infrastructure is created so that if a database server or an application server fails, a replica is ready to take over operations. With an HA infrastructure, the end customer should not experience an outage.

55. What is virtualization?

Virtualization is logical representation of physical devices. It is the technique of managing and presenting storage devices and resources functionally, regardless of their physical layout or location. Virtualization is the pooling of physical storage from multiple network storage devices into what appears to be a single storage device that is managed from a central console. Storage virtualization is commonly used in a storage area network (SAN). The management of storage devices can be tedious and time-consuming. Storage virtualization helps the storage administrator perform the tasks of backup, archiving, and recovery more easily, and in less time, by disguising the actual complexity of the SAN.

56. Describe in brief the composition of an FC frame.

Start of Frame delimiter (4 bytes)
Frame header (includes destination ID and source ID, 24 bytes/6 words)
Data payload (encapsulated SCSI instructions, 0-2112 bytes in length)
CRC (error checking, 4 bytes)
End of Frame delimiter (4 bytes)
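
Adding up the fields gives the frame size on the wire. The sketch below assumes 4-byte start-of-frame and end-of-frame delimiters, since both are four-byte ordered sets; the constant and function names are hypothetical.

```python
# Field sizes in bytes; SOF and EOF are assumed to be 4-byte ordered sets.
SOF, HEADER, MAX_PAYLOAD, CRC, EOF = 4, 24, 2112, 4, 4

def frame_size(payload_len):
    """Total frame size in bytes for a given payload length."""
    if not 0 <= payload_len <= MAX_PAYLOAD:
        raise ValueError("payload must be 0-2112 bytes")
    return SOF + HEADER + payload_len + CRC + EOF

print(frame_size(0), frame_size(MAX_PAYLOAD))  # smallest and largest frame
```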

57. What is storage virtualization?

Storage virtualization is the amalgamation of multiple network storage devices into a single storage unit.

58. What are the protocols used in physical/datalink and network layer of SAN?

a) Ethernet
b) SCSI
c) Fibre Channel

59. What are the types of disk array used in SAN?

a) JBOD
b) RAID

60. What are different types of protocols used in transportation and session layers of SAN?

a) Fibre Channel Protocol (FCP)
b) Internet SCSI (iSCSI)
c) Fibre Channel over IP (FCIP)

61. What is the type of Encoding used in Fibre Channel?

8b/10b, as this encoding technique is able to detect almost all bit errors.

62. What are the main constraints of SCSI in storage networking?

a) Deployment distance (max. of 25 m)
b) Number of devices that can be interconnected (16)

63. What is a Fabric?

A fabric is an interconnection of Fibre Channel switches.