{"id":53,"date":"2009-09-17T00:00:25","date_gmt":"2009-09-17T00:00:25","guid":{"rendered":"http:\/\/bloodforge.com\/?p=53"},"modified":"2020-02-20T02:13:16","modified_gmt":"2020-02-20T02:13:16","slug":"extract-formatted-text-from-excel-cell-with-c-rich-text-format","status":"publish","type":"post","link":"https:\/\/bloodforge.azurewebsites.net\/index.php\/2009\/09\/17\/extract-formatted-text-from-excel-cell-with-c-rich-text-format\/","title":{"rendered":"Extract Formatted Text From Excel Cell With C# (Rich Text Format)"},"content":{"rendered":"\n<p>I was writing an application that needed to convert text in a cell in an Excel workbook to HTML. It is fairly trivial to get formatting for the entire cell, but each individual character in the cell could have different formatting itself, so I needed something more specific than cell-level formatting info.<\/p>\n\n\n\n<p>At first, I started using the&nbsp;<strong>Excel.Range.get_Characters( pos, len )<\/strong>&nbsp;method to get info out of the cell.&nbsp; The code would loop through all characters, get them one by one, and check the formatting.&nbsp; For example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Microsoft.Office.Interop.Excel.Range Range = (Microsoft.Office.Interop.Excel.Range)Cell;\nint TextLength = Range.Text.ToString().Length;\nfor (int CharCount = 1; CharCount &lt;= TextLength; CharCount++)\n{\n    Microsoft.Office.Interop.Excel.Characters charToTest = Range.get_Characters(CharCount, 1);\n    bool IsBold = (bool)charToTest.Font.Bold;\n    bool IsItalic = (bool)charToTest.Font.Italic;\n    \/\/ other formatting tests here\n}<\/code><\/pre>\n\n\n\n<p>However, that method proved to be incredibly slow for cells that have more than just a few characters.&nbsp; For cells that have 1000+ characters, it would take several minutes to run the test across all characters. I kept playing around with different ways to speed up the whole process, but it just became apparent that making the call to Excel to get all of this information was not going to be acceptable.<\/p>\n\n\n\n<p>Finally, I think I\u2019ve found the solution. It is possible to copy the text from a cell to the clipboard, and then use the Clipboard class to retrieve the formatted text, and parse it with C#. I ended up using the System.Windows.DataFormats.Rtf format to extract the data from the clipboard in the following way:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>string rtf = (string)System.Windows.Clipboard.GetData(System.Windows.DataFormats.Rtf);<\/code><\/pre>\n\n\n\n<p>Then, I create a System.Windows.Forms.RichTextBox, and use that to parse the data. The following is a sample of the solution, and it is reasonably quick. <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Microsoft.Office.Interop.Excel.Range Range = (Microsoft.Office.Interop.Excel.Range)Cell;\nRange.Copy(System.Reflection.Missing.Value);\n            \nstring rtf = (string)System.Windows.Clipboard.GetData(System.Windows.DataFormats.Rtf);\nSystem.Windows.Forms.RichTextBox rtb = new System.Windows.Forms.RichTextBox();\nrtb.Rtf = rtf;\n            \nint CharCount = rtb.Text.Length;\n \nfor (int CharNum = 0; CharNum &lt; CharCount; CharNum++)\n{\n   rtb.Select(CharNum, 1);\n   System.Drawing.Font Font = rtb.SelectionFont;\n   bool IsCharBold = Font.Bold;\n   bool IsCharUnderline = Font.Underline;\n   bool IsCharItalic = Font.Italic;\n\n   \/\/ other code here<\/code><\/pre>\n\n\n\n<p>I was also asked about getting the color in the comments. To get the color, you can use: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>System.Drawing.Color color = rtb.SelectionColor;<\/code><\/pre>\n\n\n\n<p>There are also other properties of rtb dealing with selection, such as SelectionAlignment, SelectionBackColor, etc. See the\u00a0<a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/system.windows.forms.richtextbox(v=vs.110).aspx\">RichTextBox<\/a>\u00a0class for more info. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>I was writing an application that needed to convert text in a cell in an Excel workbook to HTML. It is fairly trivial to get formatting for the entire cell, but each individual character in the cell could have different formatting itself, so I needed something more specific than cell-level formatting info. At first, I [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[4],"tags":[],"_links":{"self":[{"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/posts\/53"}],"collection":[{"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/comments?post=53"}],"version-history":[{"count":1,"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/posts\/53\/revisions"}],"predecessor-version":[{"id":54,"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/posts\/53\/revisions\/54"}],"wp:attachment":[{"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/media?parent=53"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/categories?post=53"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/bloodforge.azurewebsites.net\/index.php\/wp-json\/wp\/v2\/tags?post=53"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}